I’ve been using Claude Code as my primary development tool for roughly six months. Not occasionally, not as an experiment — as the default. Every feature, every debugging session, every refactor went through it.
The honest answer to “did it change how I work?” is yes, but not in the ways I expected. The productivity gains are real but narrower than the marketing suggests. The costs are real but different from the concerns people usually raise.
## What actually changed
The biggest shift isn’t speed — it’s the ratio of reading to writing.
Before Claude Code, I wrote code and read it afterward to check if it was correct. With Claude Code, I write specifications (in natural language, in comments, in pseudocode) and read the generated code to check if it matches my intent. The activity is similar; the direction is reversed.
This sounds subtle. It isn’t. Writing code forces you to resolve ambiguity — you can’t leave “handle the error case” as a note to yourself, you have to implement it. Reading generated code lets you defer resolution of ambiguity, which means ambiguity accumulates. The code compiles, the tests pass, and three weeks later you discover that “handle the error case” was interpreted as “swallow the error silently.”
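The "swallow the error silently" failure mode is easy to make concrete. A minimal TypeScript sketch, with invented names (`loadProfileSilent`, `loadProfileExplicit`, `Profile`) rather than code from any real project:

```typescript
type Profile = { name: string };

// One reading of "handle the error case": the catch block swallows the
// failure, so callers see `null` and cannot distinguish "no profile"
// from "the request failed".
async function loadProfileSilent(
  fetchProfile: () => Promise<Profile>
): Promise<Profile | null> {
  try {
    return await fetchProfile();
  } catch {
    return null; // the error disappears here
  }
}

// The reading the specification probably meant: surface the failure
// explicitly so the caller has to deal with it.
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

async function loadProfileExplicit(
  fetchProfile: () => Promise<Profile>
): Promise<Result<Profile>> {
  try {
    return { ok: true, value: await fetchProfile() };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e : new Error(String(e)) };
  }
}
```

Both versions compile, and both pass a happy-path test. Only a reviewer who resolves the ambiguity notices that the first one hides failures.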
The engineers I know who get the most out of Claude Code are the ones who learned to write tighter specifications, not faster prompts.
## Where it’s genuinely excellent
**Boilerplate with constraints.** Anything that follows a pattern I can describe precisely — REST API handlers, TypeScript interfaces derived from a schema, test fixtures, migration files — Claude Code does well. The gain isn’t just speed; it’s that the pattern is applied consistently. No “I forgot to add the error handler on this route” because the pattern includes it.
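"The pattern includes it" is the key property. A hypothetical sketch of such a pattern — `Handler` and `withErrorHandling` are invented names, not a real framework API — where the error handler is part of the wrapper, so no generated route can omit it:

```typescript
type Req = { path: string };
type Res = { status: number; body: unknown };
type Handler = (req: Req) => Promise<Res>;

// Every route handler goes through this wrapper, so error handling is
// part of the pattern rather than something added (or forgotten) per route.
function withErrorHandling(handler: Handler): Handler {
  return async (req) => {
    try {
      return await handler(req);
    } catch (e) {
      // No route can skip this branch.
      return {
        status: 500,
        body: { error: e instanceof Error ? e.message : "unknown error" },
      };
    }
  };
}

// Each generated handler is wrapped the same way.
const getHealth = withErrorHandling(async () => ({
  status: 200,
  body: { ok: true },
}));
```

When the pattern is this explicit, "generate ten more handlers like this" is exactly the kind of task that comes back consistent.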
**Cross-file refactoring.** Renaming a type used in thirty files, updating an interface and propagating the changes to all implementations, moving a module and updating imports — these operations are tedious manually and error-prone because humans skip files. Claude Code doesn’t get bored.
**Documentation I would have skipped.** I’m not proud of this, but I documented things with Claude Code that I would have left undocumented on my own. The friction is low enough that “add JSDoc to this function” takes less time than deciding whether it’s worth doing.
**Rubber duck debugging.** Describing a problem in enough detail for Claude Code to help often reveals the problem before it responds. This is the Feynman technique as a side effect.
## Where it underperforms expectations
**Complex business logic with multiple constraints.** When the logic requires holding several domain invariants in tension — the kind of code that senior engineers write slowly and carefully — Claude Code produces plausible-looking code that often misses a constraint. Not always. Often enough that I no longer use it for this category without very careful review.
In financial systems, “plausible-looking but slightly wrong” is worse than “obviously broken.” Obvious breaks are caught in testing. Subtle logic errors in eligibility rules or interest calculations are caught in production.
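An invented illustration of that failure mode: two interest accruals that agree on a one-month unit test but can drift apart by cents over longer horizons. The rate and the rounding rule are made up for the example; neither function is from a real system.

```typescript
// Hypothetical spec: round the balance to cents after each monthly accrual.
function balanceRoundedMonthly(
  principal: number,
  annualRate: number,
  months: number
): number {
  let cents = Math.round(principal * 100);
  for (let i = 0; i < months; i++) {
    cents = Math.round(cents * (1 + annualRate / 12));
  }
  return cents / 100;
}

// Plausible-looking variant: accrue in floating point, round once at the end.
// Agrees with the spec for one month; can drift by cents over many periods,
// which is exactly the kind of error that surfaces in production, not tests.
function balanceRoundedOnce(
  principal: number,
  annualRate: number,
  months: number
): number {
  const raw = principal * Math.pow(1 + annualRate / 12, months);
  return Math.round(raw * 100) / 100;
}
```

A unit test over one period passes for both. The divergence only shows up where the rounding rule actually matters: long horizons, many accounts, reconciliation against another system.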
**Architectural decisions.** Claude Code doesn’t know what a codebase needs to be in two years. It doesn’t know the team’s capabilities, the organization’s risk tolerance, or the regulatory constraints a solution has to navigate. It produces solutions for the narrow technical problem in front of it. The context that makes a solution actually good isn’t in the prompt.
**Debugging novel production issues.** For known error patterns, it’s useful. For “this works in staging, fails under specific load conditions in production, and the logs show X but not Y” — the kind of debugging that requires intuition about the system — it’s a distraction more than a help.
## The CLAUDE.md investment
The most underrated feature is the CLAUDE.md file. It lets you encode context that persists across sessions: coding conventions, architectural constraints, what to avoid and why.
Here’s a fragment from one I use on a TypeScript project:
```markdown
## Architecture constraints

- Never use `any` — use `unknown` and narrow it explicitly
- All API responses are validated with Zod before use
- Form validation schemas live in `src/schemas/`, not in component files
- Side effects go in `src/infrastructure/`, never in domain or application layers

## What to avoid

- Don't use `useEffect` for data fetching — use React Query
- Don't add `console.log` in application code
- Don't write catch blocks that swallow errors silently
- Don't add optional chaining where the value should always be present — fix the type instead
```
The ROI on this investment is asymmetric. Thirty minutes spent writing a good CLAUDE.md eliminates a correction you would otherwise repeat in every session for months.
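To make the first rule in that fragment concrete: a minimal sketch of narrowing `unknown` explicitly instead of reaching for `any`, assuming a made-up `User` shape.

```typescript
type User = { id: number; name: string };

// User-defined type guard: checks the shape at runtime and narrows the
// static type from `unknown` to `User` when it passes.
function isUser(value: unknown): value is User {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { id?: unknown }).id === "number" &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

function parseUser(json: string): User {
  const data: unknown = JSON.parse(json); // not `any`: forces a check before use
  if (!isUser(data)) {
    throw new Error("payload is not a User");
  }
  return data; // narrowed to User here
}
```

With `any`, a malformed payload would flow silently through the type system; with `unknown`, the compiler refuses to let you use the value until it has been checked.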
## The mental model shift
The most useful reframe: Claude Code is a fast, knowledgeable, slightly overconfident collaborator with no memory, no domain context, and no stake in the outcome.
Treating it like a search engine — “generate X” — produces mediocre results. Treating it like a pair programmer with these specific characteristics — good at patterns, needs explicit context, requires review, confident even when wrong — produces good results.
The engineers who struggle with Claude Code are often the ones who treat it as an oracle and skip review, or who treat it with so much skepticism that they only use it for tasks so simple they could have done them faster manually.
The balance is: trust the output for low-stakes operations, verify the output for anything that matters, and invest in context — CLAUDE.md, specific prompts, accurate descriptions of constraints — before expecting accuracy on complex operations.
## Net assessment
Six months in: I write less code than I used to, and I review more. My time goes to writing specifications, thinking about constraints, and reviewing output rather than to typing the code myself.
For experienced engineers who are already good at specification and review, the productivity gain is real — probably 30–40% on tasks where it’s applicable. For tasks that require domain knowledge, system intuition, or careful constraint navigation, the gain is smaller or negative once you account for careful review time.
It didn’t replace engineering judgment. It changed what engineering judgment is applied to.