The majority of testers I've surveyed describe AI output as noisy, incorrect, or requiring heavy rework. In workshops, I hear the same prescription every time: better prompting, more context, a different model. These are answers to the wrong question.
The correction loop starts before the prompt. It starts the moment a team decides which task to hand to the AI without asking whether the task is well-defined enough for a machine to handle, or whether the team would recognise a good result if they saw one. When the use case is poorly chosen, the AI produces something plausible-looking and the human spends their time correcting output they cannot properly evaluate. That isn't humans and machines working together. That's humans cleaning up after a decision they didn't consciously make.
This workshop teaches a practical framework for making that decision deliberately. You'll work through your own real AI use cases using a structured diagnostic that draws on the skills you already have from test analysis and risk assessment. There is no slide-heavy instruction: the session is pair work, structured critique, and practice against real examples from your own work. You'll leave with:
- An audit of at least two of your current AI use cases against explicit diagnostic criteria, with a clear view of which are worth continuing and which should be redesigned or dropped
- An explicit definition of "working" for one use case, specific enough to share with your team and act on within the week
- Language to explain your diagnostic reasoning to a stakeholder who believes the tool is the solution, without making it a debate about the tool
- A reusable, tool-agnostic framework that applies across AI tools, team sizes, and testing contexts