There’s a pattern I’ve observed across engineering teams that adopt AI coding tools: initial productivity gains, followed by a plateau, followed by a vague frustration that the tools “don’t really understand” the codebase.
The diagnosis is usually wrong. The tools aren’t failing to understand the codebase. They’re failing to understand the constraints that aren’t in the codebase — the business rules, the architectural decisions, the things that were decided before a line was written and never encoded anywhere a machine can read.
Spec-Driven Development is the practice of encoding those constraints before asking AI tools — or other engineers — to implement them.
## What a spec is and isn’t
A spec is not a requirements document. Requirements describe what a system should do from a user’s perspective. A spec describes what an implementation should look like from an engineering perspective.
A spec includes:
- Data models with types and constraints
- API contracts (inputs, outputs, error cases)
- Behavioral invariants — things that must always be true
- What’s explicitly out of scope
- The decision made when there were multiple valid options
A spec is written before implementation. Not as part of it — before it. This is the discipline that most teams skip, and it’s the one that makes AI tools significantly more useful.
## A real example
Here’s a spec I wrote before implementing a form submission service for a loan application:
```markdown
# LoanApplicationSubmissionService

## Responsibility
Validates a completed loan application and submits it to the core banking API.
Does NOT handle partial applications or draft state — that's ApplicationDraftService.

## Input
LoanApplication: {
  applicantId: string (UUID)
  productCode: string (from LOAN_PRODUCT enum)
  requestedAmount: number (positive, max 50_000_000 CLP)
  termMonths: number (12 | 24 | 36 | 48 | 60)
  applicantProfile: ApplicantProfile (see schemas/applicant.ts)
}

## Validation order
1. Schema validation via LoanApplicationSchema (Zod)
2. Business rule validation:
   - Applicant must be 18+ and < 75
   - requestedAmount must not exceed 5x monthly income
   - No active applications in the last 90 days
3. Product eligibility check via ProductEligibilityService

## Output
Success: { submissionId: string, estimatedDecisionDate: string (ISO 8601) }
Failure: { code: ApplicationErrorCode, message: string, field?: string }

## Error codes
VALIDATION_FAILED — schema validation error (include field name)
BUSINESS_RULE_VIOLATION — business rule failed (include rule id)
PRODUCT_INELIGIBLE — applicant doesn't qualify for this product
CORE_API_UNAVAILABLE — upstream error (do not retry in this layer)

## Invariants
- Never commit to the database before core API responds
- Log every submission attempt with full input (sanitized) and outcome
- Do not expose core API error details to the caller

## Out of scope
- Draft saving
- Document collection
- Decision notification
```
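Contract sections like Input, Output, and Error codes translate almost mechanically into types. Here is a sketch in plain TypeScript — the identifiers and the `LOAN_PRODUCT` values are invented for illustration, and the real service would validate through the Zod schema the spec names rather than a hand-rolled check:

```typescript
// Illustrative sketch of the spec's contracts; names are not from the real codebase.
type LoanProductCode = 'CONSUMER' | 'MORTGAGE' | 'AUTO'; // hypothetical LOAN_PRODUCT values
type TermMonths = 12 | 24 | 36 | 48 | 60;

interface LoanApplication {
  applicantId: string;          // UUID
  productCode: LoanProductCode;
  requestedAmount: number;      // positive, max 50_000_000 CLP
  termMonths: TermMonths;
}

// The spec's error codes, written down so the implementation can't improvise its own.
type ApplicationErrorCode =
  | 'VALIDATION_FAILED'
  | 'BUSINESS_RULE_VIOLATION'
  | 'PRODUCT_INELIGIBLE'
  | 'CORE_API_UNAVAILABLE';

// Discriminated union mirroring the Success/Failure shapes in the Output section.
type SubmissionResult =
  | { ok: true; submissionId: string; estimatedDecisionDate: string } // ISO 8601
  | { ok: false; code: ApplicationErrorCode; message: string; field?: string };

// Schema-level check mirroring the spec's constraint on requestedAmount.
// Returns null when the input passes this layer.
function validateSchema(app: LoanApplication): SubmissionResult | null {
  if (app.requestedAmount <= 0 || app.requestedAmount > 50_000_000) {
    return {
      ok: false,
      code: 'VALIDATION_FAILED',
      message: 'requestedAmount out of range',
      field: 'requestedAmount',
    };
  }
  return null;
}
```

Because the result type is a discriminated union, a caller that forgets to handle the failure branch fails to compile — the spec's error contract becomes machine-checked.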
When I handed this spec to Claude Code with “implement this service in TypeScript, following the architecture in src/application/”, the output was immediately usable. Not perfect — the error handling needed adjustment and one invariant was implemented incorrectly — but the structure was right, the types were right, and the review was fast because I had a clear specification to review against.
Without the spec, the same prompt produces code that works in the happy path and has no coherent error handling strategy, because error handling requires knowing which errors matter and why.
## Writing specs for AI consumption
Specs written for human engineers can be imprecise because humans fill gaps with shared context. Specs for AI tools need to be explicit about things that seem obvious.
Enumerate the error cases. Humans infer “and handle errors appropriately.” AI tools implement the first error-handling pattern in their training data that seems relevant. If you want specific error codes with specific semantics, write them down.
Specify the ordering of operations. “Validate, then check eligibility, then call the API” looks obvious in the spec above. Without the ordering, the implementation might check eligibility before validating input — different error messages, different behavior under invalid input.
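One way to make the ordering unmissable is to encode it as data: a list of checks run strictly in declared order, stopping at the first failure. This is a sketch of the idea, not the actual service — the rule thresholds are taken from the spec above, but the helper and its names are invented:

```typescript
// Hypothetical sketch: the spec's validation order encoded as an explicit pipeline.
type CheckResult = { code: string; message: string } | null; // null = pass

type Check<T> = (input: T) => CheckResult;

// Runs checks strictly in the order declared, returning the first failure —
// so eligibility can never run before schema validation.
function runInOrder<T>(input: T, checks: Check<T>[]): CheckResult {
  for (const check of checks) {
    const failure = check(input);
    if (failure !== null) return failure;
  }
  return null;
}

interface Applicant { age: number; monthlyIncome: number; requestedAmount: number }

const checks: Check<Applicant>[] = [
  // 1. Schema-level check
  (a) => (a.requestedAmount > 0
    ? null
    : { code: 'VALIDATION_FAILED', message: 'amount must be positive' }),
  // 2. Business rules, in spec order
  (a) => (a.age >= 18 && a.age < 75
    ? null
    : { code: 'BUSINESS_RULE_VIOLATION', message: 'applicant age out of range' }),
  (a) => (a.requestedAmount <= 5 * a.monthlyIncome
    ? null
    : { code: 'BUSINESS_RULE_VIOLATION', message: 'amount exceeds 5x monthly income' }),
];
```

With the order reified in one array, changing it is a one-line diff that a reviewer can compare directly against the spec's "Validation order" section.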
Declare what’s out of scope explicitly. If the service shouldn’t handle draft state, say so. Otherwise the implementation might add an isDraft parameter because it seems like a natural extension.
Include invariants. Invariants are the things that must remain true regardless of input. They’re rarely encoded in requirements, frequently missed in implementation, and expensive to discover in production.
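Because invariants are properties rather than cases, they lend themselves to small enforcing helpers. A sketch of the "log every attempt with sanitized input" invariant from the spec above — the sensitive field names are invented for the example:

```typescript
// Illustrative helper enforcing the "sanitize before logging" invariant.
// Which fields count as sensitive is an assumption here, not from the spec.
const SENSITIVE_FIELDS = new Set(['nationalId', 'monthlyIncome']);

function sanitizeForLog(input: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    out[key] = SENSITIVE_FIELDS.has(key) ? '[REDACTED]' : value;
  }
  return out;
}
```

If every log call is routed through a helper like this, the invariant holds by construction instead of by reviewer vigilance.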
## The CLAUDE.md layer
Specs cover individual implementations. CLAUDE.md covers the project-level context that applies across all implementations: architecture patterns, conventions, what to avoid.
The two work together. CLAUDE.md provides ambient context (“we use Zod for validation, we don’t use any”); the spec provides implementation-specific context (“validate in this order, return these error codes”).
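For contrast with the per-service spec, here is a hypothetical excerpt of the kind of ambient context a CLAUDE.md carries — the contents are invented for illustration, not taken from a real project:

```markdown
# CLAUDE.md (excerpt — illustrative)

## Architecture
- Services live in src/application/; one class per responsibility
- Validation uses Zod schemas in src/schemas/; no hand-rolled type guards

## Conventions
- No `any`; prefer discriminated unions for results over thrown exceptions
- Error codes are defined centrally and referenced by per-service specs in specs/
```

Note that nothing here mentions loan applications: CLAUDE.md stays implementation-agnostic, and the spec carries everything specific to one service.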
A workflow I’ve found effective:
1. Write the spec in a `specs/` directory before touching the code
2. Reference it in the prompt: “implement the service described in `specs/loan-submission.md`”
3. Review the output against the spec, not against your mental model of what you wanted
4. If the output deviates, update the spec to be more precise, then regenerate
Step 3 is the crucial part. Reviewing against a written spec is faster and more reliable than reviewing against your memory of intent. The spec externalizes the intent — which is where it should be.
## The compounding effect
Specs don’t just make the first implementation faster. They make every subsequent change faster.
When requirements change six months later, you read the spec first. You understand what the implementation was protecting against. You update the spec to reflect the new constraints. Then you implement the change — or ask Claude Code to implement it — against an updated, accurate specification.
Without a spec, every change requires reverse-engineering intent from code. You read the implementation, try to infer why it’s structured the way it is, make assumptions, and sometimes get them wrong. The spec eliminates that step.
## What this costs
Writing a spec takes time. For a service of moderate complexity, maybe two hours. That’s two hours not spent writing code.
The return: an implementation reviewable in thirty minutes instead of two hours, with a document that serves as the ADR for why the implementation looks the way it does.
For systems where correctness matters — financial services, anything with compliance requirements, anything that’s hard to test exhaustively — the spec pays for itself in the first significant debugging session it prevents.
## The broader shift
Spec-Driven Development isn’t primarily an AI practice. It’s an engineering practice that becomes significantly more valuable in an AI-assisted workflow.
Before AI tools, specifications helped human engineers align before they built. That value was real but the overhead was high enough that many teams skipped it.
With AI tools, the spec becomes the primary interface between engineering intent and generated implementation. The quality of the spec directly determines the quality of the output. The overhead of writing a spec is now offset by the speed of implementation — and the alternative (generating without a spec and reviewing carefully) is often slower.
The engineers I work with who get the most out of AI tools are the ones who write the best specs. That hasn’t surprised me. Good specification is the core skill of senior engineering. It just has a faster consumer now.