Define Good Before You Build
I’ve spent enough of my career reading and writing statements of work to know what vague language does to a project. You can look at a scope document before a single person has been assigned to it and tell whether the project is set up to succeed or set up to be argued about later. “Implement a recommendation engine to improve user engagement” is not a scope. It’s an aspiration. Somewhere in the next six months someone is going to spend real money finding out that “improve engagement” meant three different things to the three people who approved the SOW.
None of this is new. Vague scope has been killing projects since before any of us were in the industry. What’s new is that AI is giving teams a way to skip the fundamentals and feel good about it.
The pattern
The pattern works like this. Someone has an idea: what if we could use AI to do some task. They fire up a model, get an output that looks impressive, show it in a meeting, and suddenly there’s a new AI project with lots of expectations and hype attached to it. There’s a demo. There’s excitement. But the thing that should have happened before any of that never happens. Nobody stops to define the actual problem, who it’s for, what the success criteria are, or how you’d know it’s working. The demo becomes the justification. The excitement becomes the plan.
Exploration isn’t a POC
This is where the language matters. Teams often call this early work a proof of concept. It isn’t one. A proof of concept starts with at least a rough definition of a problem and tests whether a given technology can feasibly address it. What many AI projects are doing is exploration. They’re feeding a model some data or a vague prompt to see what it can do, but there’s no problem statement underneath any of it. Exploration is valuable. It shows you what’s possible. But it doesn’t validate anything, because there was nothing defined to validate against. Calling it a POC gives the project a false sense of maturity, as if a real question was asked and answered. The question “can this model produce something cool” is not the same question as “can this model solve the problem we need solved.”
The irony is that people skip the definition step thinking it will slow them down, and then they end up doing the definition work anyway. Just later and at a much higher cost. Here is how I often see it play out. A team is mid-build on an AI project, weeks or months in, and they start realizing they don’t fully understand the boundaries of the problem they’re solving. What are the actual inputs? What are the edge cases? What does the process look like end to end? These are questions that belong on a whiteboard before anyone writes code. Instead, they surface during a sprint, when the cost of answering them includes rework, wasted cycles, and a team that’s already committed to an architecture that may not fit.
You can’t actually avoid defining the problem. You can only choose whether you do it deliberately upfront or accidentally mid-build. One costs you days, the other quarters.
The underlying issue is that AI doesn’t change the fundamentals of building things. You still need to know what you’re building, for whom, what the success criteria are, and whether the environment can support it. What AI does change is how easy it is to skip those steps.
This matters more for AI than for traditional software because deterministic systems have a floor. You tell a function to add two numbers, and as long as the code is correct, it adds two numbers. AI doesn’t have that floor. The same input can produce different outputs, and quality varies along dimensions that aren’t always obvious. That variance makes the absence of a clear problem statement and success criteria worse. The more variance you have in what a system produces, the more you need to know what you’re measuring it against.
What good scope looks like
So concretely: “use AI to improve marketing” is not a plan. What specific output are you expecting? What defines a successful result? What does a bad one look like? What volume, what tone, what audience?
Compare that with: build an application that helps marketing draft emails using the company’s internal technical documentation as source material, with output reviewed before sending. Now you have source data, an acceptance step, a defined workflow, and enough structure that someone can build it, someone else can test it, and you can tell whether it works. The AI is a component inside a workflow that makes sense on its own terms.
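A scope that specific is concrete enough to sketch. Here is a minimal sketch of that shape of workflow, assuming hypothetical names throughout (`Draft`, `draft_email`, `review` are illustrations, not a real API, and the model call is stubbed out). The point is structural: the AI sits inside a workflow with a defined input (internal docs), a traceable output, and an explicit review gate before anything is sent.

```python
# Sketch of the scoped workflow: defined source data, a drafting step,
# and a human acceptance step. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Draft:
    subject: str
    body: str
    sources: list[str] = field(default_factory=list)  # which internal docs grounded the draft
    approved: bool = False


def draft_email(topic: str, docs: dict[str, str]) -> Draft:
    """Stand-in for the model call: draft only from internal documentation."""
    relevant = [name for name, text in docs.items() if topic.lower() in text.lower()]
    body = f"Draft about {topic}, based on: {', '.join(relevant) or 'no matching docs'}."
    return Draft(subject=f"Update: {topic}", body=body, sources=relevant)


def review(draft: Draft, ok: bool) -> Draft:
    """The acceptance step: nothing goes out without a human decision."""
    draft.approved = ok
    return draft


docs = {"release-notes.md": "The new API supports batch uploads."}
d = review(draft_email("batch uploads", docs), ok=True)
# The draft records its sources, and approval is an explicit, testable state.
```

Because every draft carries its sources and an approval flag, someone can build this, someone else can test it, and a failure (“draft grounded in nothing”, “sent without review”) is detectable rather than a matter of opinion.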
Specific KPIs, not directions
The same specificity applies to the KPIs. “Improve customer satisfaction” is a direction, not a measurement. “Reduce average support ticket resolution time from four hours to two hours” is a KPI. You can build against it. You can test against it. You can tell six months in whether you hit it or not. Without that specificity, you end up retrofitting metrics after launch to justify the spend.
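A KPI like that one is testable with a few lines of arithmetic. A minimal sketch, assuming hypothetical data (resolution times in hours; the function names are illustrations):

```python
# Testing against a concrete KPI: average ticket resolution time vs. a target.
# The ticket data below is made up for illustration.
def mean_resolution_hours(tickets: list[float]) -> float:
    return sum(tickets) / len(tickets)


def kpi_met(tickets: list[float], target_hours: float = 2.0) -> bool:
    return mean_resolution_hours(tickets) <= target_hours


baseline = [4.5, 3.5, 4.0]  # averages to 4.0 hours: the starting point
after = [2.0, 1.5, 2.5]     # averages to 2.0 hours: the target
# kpi_met(baseline) is False; kpi_met(after) is True.
```

Trivial as it is, this check only exists because someone wrote down “four hours to two hours”. With “improve customer satisfaction” there is nothing to compute.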
The cost of defining good before you build is a few days. A working session, some writing, some alignment with the actual users. The cost of not defining it is months of a team scrambling to justify a project that nobody can say succeeded or failed.
The question I want every AI project to answer before it starts is the one that sounds the most boring: what does good look like, specifically, in writing, with examples. If the team can answer that, the rest of the project has a chance.