After spending the better part of two years integrating AI into my daily engineering work, I've arrived at some conclusions that might surprise you. The productivity gains are real—but they're not where most people think they are.
Let me share what's actually working.
The Uncomfortable Truth About AI-Assisted Coding
Here's something I rarely see discussed: the engineers getting the most value from AI tools aren't the ones generating the most code. They're the ones who've learned to use AI as a thinking partner rather than a typing assistant.
When I first started using tools like GitHub Copilot, I measured success by how many lines of code I could generate. That metric completely missed the point. The real value wasn't in the code—it was in the conversations that shaped my understanding of the problem.
Now, before I write a single line, I'll describe my problem to Claude or chat with my editor's AI. Not to get code, but to pressure-test my assumptions. "What edge cases am I missing?" or "What would a senior engineer question about this approach?" These conversations have prevented more bugs than any amount of generated code ever could.
Addy Osmani captured this perfectly in his 70% problem observation: AI can get you most of the way to a solution remarkably fast, but that last 30%—edge cases, security, production integration—remains as challenging as ever. The trap is that because the first 70% came so easily, you might underestimate how much work remains. As he puts it: "The goal isn't to write more code faster. It's to build better software."
The Tools That Actually Changed My Workflow
I've tried nearly every AI coding tool that's come out. Here's where I've landed after all that experimentation:
For thinking through architecture and complex problems: I use Claude (either through the chat interface or Claude Code in my terminal). When I'm facing a gnarly problem—a race condition I can't quite pin down, or a refactoring that touches too many files to hold in my head—I need an AI that can reason deeply. The conversation format matters here. I'm not looking for code; I'm looking for a thought partner who can help me see what I'm missing.
For the actual coding grind: Cursor has become my daily driver. What makes it work isn't the code generation—it's the codebase awareness. When I ask it to implement something, it understands my project's patterns, my existing types, my folder structure. That context makes the difference between suggestions I can use and suggestions I have to rewrite. Cursor's RAG-like system that gathers context from your local filesystem gives it an edge that pure chat-based tools can't match.
For quick iterations and prototypes: GitHub Copilot still earns its place for inline completions. When I know exactly what I want and just need to type it faster, nothing beats the flow of having it complete my thoughts mid-keystroke.
For terminal-based workflows: Claude Code has genuinely surprised me. Armin Ronacher—the creator of Flask—described it perfectly in his recent YouTube talk on agentic coding: "A friend called Claude Code catnip for programmers and it really feels like this. I haven't felt so energized and confused and just so willing to try so many new things... it is really incredibly addicting."
Running an AI agent directly in my terminal, with full access to my filesystem and the ability to run commands, felt reckless at first. But with proper oversight, it's become invaluable for those multi-file refactorings that would otherwise eat an entire afternoon.
Insights From Those Who Build These Tools
One of the most valuable perspectives I've encountered comes from Boris Cherny at Anthropic, who shared how their team actually uses Claude Code. A few insights that changed how I work:
Plan mode before action. People new to coding with AI agents often assume Claude Code can one-shot anything, but Cherny says that's not realistic. You can double or triple your chances of success on complex tasks by switching to "plan mode"—which has Claude map out what it's going to do step-by-step—and aligning on an approach before any code gets written.
Sub-agents that challenge each other. Cherny uses sub-agents—separate instances of Claude working in parallel—to catch issues before code gets merged. His code review command spawns several sub-agents at once: one checks style guidelines, another combs through the project's history, another flags bugs. Then he uses more sub-agents specifically tasked with poking holes in the original findings. "In the end, the result is awesome—it finds all the real issues without the false ones."
Stop hooks for automation. You can set up a stop hook that runs your test suite; if any tests fail, the hook tells Claude to fix the problem and keep going instead of stopping. "You can just make the model keep going until the thing is done." A minimal sketch of such a hook follows this list.
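To make that concrete, here's a minimal sketch of what such a hook could look like as a Python script. The registration and input/output conventions (a "Stop" hook command in your Claude Code settings, JSON arriving on stdin, exit code 2 blocking the stop and feeding stderr back to the model) are my reading of the hook mechanism; treat them as assumptions and check the current docs before relying on them.

```python
#!/usr/bin/env python3
"""Stop-hook sketch: keep the agent working until the test suite passes.

Assumptions to verify against the Claude Code hooks documentation:
  - this script is registered as a "Stop" hook command in .claude/settings.json
  - hook input arrives as JSON on stdin (including a stop_hook_active flag)
  - exiting with code 2 blocks the stop, and stderr is fed back to Claude
"""
import json
import subprocess
import sys

hook_input = json.load(sys.stdin)  # session metadata from the agent

# Guard against infinite loops: if we've already blocked a stop once, let it end.
if hook_input.get("stop_hook_active"):
    sys.exit(0)

# Run the project's test suite; swap in whatever command your project uses.
result = subprocess.run(
    ["pytest", "-q", "--maxfail=5"],
    capture_output=True,
    text=True,
)

if result.returncode != 0:
    # Explain why the agent shouldn't stop yet; the failures become its next task.
    print(
        "Tests are failing. Fix them before finishing:\n" + result.stdout[-3000:],
        file=sys.stderr,
    )
    sys.exit(2)

sys.exit(0)  # all green: allow the session to stop
```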
The Agentic Coding Philosophy
Armin Ronacher's insights on agentic coding have been particularly influential in how I think about these tools. A few things that stuck with me from his talks and writings:
The terminal is your friend. There's been a surprising comeback of terminal-based interfaces. Claude Code, Amp, Gemini CLI—all of them are TUI-first. As Armin notes, when your agentic coding tool can run commands in a terminal, you can mostly avoid complex integrations. Instead of adding a new MCP tool, write a script or add a Makefile command and tell the agent to use that instead.
Simple project structures win. Simple project structures and well-known languages work best with agents. Go, PHP, and "basic Python" are ideal because they have less ecosystem churn, clear patterns, and rely heavily on standard libraries. If you're working in a framework that changes every six months, the AI will struggle with outdated patterns.
Design for agent recovery. Build supporting tools that fail with very clear errors so agents can recover when something goes wrong. The agents aren't perfect, but they're surprisingly good at fixing their own mistakes if you give them useful feedback (see the helper-script sketch after this list).
Parallelism is the multiplier. The real secret to productivity gains isn't any single tool—it's parallelism. In traditional development, you work on tasks sequentially. With async AI agents, you can have multiple instances running on different git worktrees, each tackling a separate task. Your role shifts from implementer to orchestrator: you provide direction, review output, and course-correct when needed. There's a rough sketch of this pattern below, after the helper-script example.
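To make the script-over-MCP-tool idea concrete, here's a minimal sketch of the kind of helper I'd point an agent at. The file names, keys, and the check itself are hypothetical; the point is the shape: one focused command with loud, specific, actionable error messages.

```python
#!/usr/bin/env python3
"""Hypothetical helper: scripts/check_schema.py

One focused check, run as `python scripts/check_schema.py`, that fails loudly
with an error message an agent can act on without guessing.
"""
import json
import pathlib
import sys

SCHEMA = pathlib.Path("config/schema.json")       # hypothetical project file
REQUIRED_KEYS = {"version", "tables", "indexes"}  # hypothetical contract


def main() -> int:
    if not SCHEMA.exists():
        print(f"ERROR: {SCHEMA} is missing. "
              f"Run `python scripts/generate_schema.py` to create it, "
              f"then re-run this check.", file=sys.stderr)
        return 1
    try:
        data = json.loads(SCHEMA.read_text())
    except json.JSONDecodeError as exc:
        print(f"ERROR: {SCHEMA} is not valid JSON (line {exc.lineno}): {exc.msg}. "
              f"Fix the syntax error at that line and re-run.", file=sys.stderr)
        return 1
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        print(f"ERROR: {SCHEMA} is missing keys: {sorted(missing)}. "
              f"Add them (see config/schema.example.json) and re-run.",
              file=sys.stderr)
        return 1
    print("schema check passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

The Makefile entry, if you add one, just calls this script; the agent only needs to be told to run it and read what it prints.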
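And a rough sketch of the worktree-per-task pattern. The git worktree commands are standard git; the agent command is a placeholder for whichever CLI agent you run, and launching it headlessly like this is an assumption about your setup, so treat it as a starting point rather than a recipe.

```python
#!/usr/bin/env python3
"""Sketch: one git worktree per task, one agent process per worktree.

`git worktree add` is standard git; AGENT_CMD is a placeholder for whatever
agent CLI you use and however it accepts a task prompt.
"""
import subprocess
from pathlib import Path

TASKS = {  # hypothetical task list: branch name -> instructions for the agent
    "fix/login-race": "Fix the race condition in the session refresh logic.",
    "chore/extract-billing": "Extract billing helpers into their own module.",
}

AGENT_CMD = ["my-agent", "--task"]  # placeholder: substitute your agent CLI

procs = []
for branch, prompt in TASKS.items():
    workdir = Path("../worktrees") / branch.replace("/", "-")
    # Create an isolated checkout on a new branch so agents can't trample each other.
    subprocess.run(["git", "worktree", "add", "-b", branch, str(workdir)], check=True)
    # Launch the agent asynchronously in that worktree; review each result before merging.
    procs.append(subprocess.Popen(AGENT_CMD + [prompt], cwd=workdir))

for proc in procs:
    proc.wait()
```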
What I Wish I'd Known Earlier
AI makes experienced engineers faster, but it doesn't make inexperienced engineers better. This was the hardest lesson. When I watch a junior developer use AI, they often accept suggestions too readily. They don't have the pattern recognition to spot when the AI is confidently wrong. The AI gives them speed at the cost of understanding.
Osmani observed this same pattern: "Here's the most counterintuitive thing I've discovered: AI tools help experienced developers more than beginners." Watch a senior engineer work with AI, and you'll notice they're constantly refactoring the generated code, applying years of hard-won engineering wisdom to shape and constrain the AI's output.
If you're earlier in your career, resist the temptation to let AI do the thinking. Use it to learn, not to skip learning. Ask it to explain the code it generated. Ask why it chose one approach over another. Make it teach you.
The 70% problem requires budget awareness. AI can get you 70% of the way to a solution remarkably fast. But that last 30%—the edge cases, the error handling, the production considerations—remains as hard as ever.
I've learned to budget my time accordingly. When AI helps me scaffold something quickly, I don't celebrate. I shift my focus to the harder parts that AI struggles with. The teams seeing 90% speedups are often doing greenfield development, free of the technical debt and baggage that comes with real software that has existed for a while.
Your prompting skills matter more than the model. I've seen engineers complain that an AI tool "doesn't work" while I'm getting excellent results from the same tool. The difference is rarely the tool—it's how you communicate with it.
Armin makes a compelling point here: unless you have rigorous rules you consistently follow as a developer, simply taking the time to talk to the machine and give it clear instructions outperforms elaborate pre-written prompts. He even uses voice input to describe what he wants stream-of-consciousness, because speaking can be faster than typing and encourages providing more context.
How I Structure My AI-Assisted Day
Here's what a typical day looks like now:
Morning (design and planning): I use conversational AI to think through whatever I'm about to build. I describe the problem, explore trade-offs, and identify what could go wrong. This is high-leverage time—ten minutes here saves hours later. Sometimes I'll use plan mode explicitly to have the AI map out an approach before touching any code.
Deep work blocks (implementation): This is where Cursor or similar editor-integrated AI earns its keep. I'm writing code, but with a copilot that understands my codebase. I accept probably 60% of suggestions and modify another 25%. The remaining 15% I reject and write myself.
Review and refinement: Before committing, I'll often ask AI to review what I've written. "What bugs might be hiding here?" or "How would you improve this?" It catches things I miss when I'm too close to the code. Cherny's sub-agent approach—having multiple AI reviewers challenge each other's findings—has made my reviews significantly more thorough.
Context management: For longer sessions, I've learned to use commands like /compact (which intelligently summarizes the conversation while preserving key insights) instead of starting fresh and losing valuable history. The goal is strategic context management, not just accepting whatever the AI does.
Learning and exploration: When I encounter something new—a library I haven't used, a pattern I don't recognize—AI is my first stop. Not to have it write the code for me, but to have it explain concepts and show examples until I understand.
The Practices That Protect Code Quality
AI-generated code can be seductive. It compiles. It looks reasonable. But I've learned to be skeptical in specific ways:
I read every line. This might sound obvious, but it's easy to let your guard down when code appears fully formed. As Osmani puts it: "AI is a tool. If your name is on there, when that code is getting submitted, you are responsible for what you submitted. So you need to make sure that you're reviewing it."
I never commit code I can't explain. This is a golden rule. AI can help me write code faster, but it can't help me debug code I don't understand. If I ask AI to do a thing and it ends up rewriting code in five different places, I go back and understand how it all works.
I verify claims. AI will confidently tell you that a library works a certain way. Sometimes it's right. Sometimes it's hallucinating based on outdated training data. I always verify before depending on AI's assertions about external systems.
I keep things modular. AI works better with smaller, focused pieces. So I've become more disciplined about decomposition—not just for AI, but because it turns out small, focused modules are easier for humans to understand too.
Test-driven development becomes more valuable, not less. Having the AI write tests first enforces discipline. I'll often say: "Write comprehensive tests for this function. Do NOT implement yet." Then I review the tests, and only then ask for the implementation. This catches many issues before they become problems. A sketch of that flow is below.
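Here's roughly what that looks like in practice, using a hypothetical normalize_phone function. Step one asks only for tests like these; step two, after reviewing them, asks for the implementation.

```python
# Step 1: ask the AI for tests only ("Do NOT implement yet"), then review them.
# normalize_phone() is hypothetical; the point is that the tests pin down
# behavior (including edge cases) before any implementation exists.
import pytest

from phone import normalize_phone  # does not exist yet; written in step 2


def test_strips_formatting_characters():
    assert normalize_phone("(555) 123-4567") == "+15551234567"


def test_preserves_existing_country_code():
    assert normalize_phone("+44 20 7946 0958") == "+442079460958"


def test_rejects_inputs_that_are_too_short():
    with pytest.raises(ValueError):
        normalize_phone("12345")


def test_rejects_letters():
    with pytest.raises(ValueError):
        normalize_phone("555-CALL-NOW")


# Step 2 (separate prompt): "Now implement normalize_phone so these tests pass."
```

Reviewing the tests before any implementation exists is where the leverage is: disagreements about behavior surface as edits to a dozen lines of test code instead of a rewrite.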
The Honest Assessment
Not everything works. The METR randomized controlled trial found that experienced open-source developers actually took 19% longer with AI tools on their own repositories. That surprised a lot of people, but it makes sense when you consider the overhead: context-switching, reviewing AI output, course-correcting when it goes wrong.
The productivity gains are real but nuanced. GitHub's studies show 55% faster task completion for routine coding, while enterprise deployments report up to 67% reductions in code review turnaround. But gains are more modest for complex tasks, and the disruption caused by overhauling existing processes often counteracts the increased coding speed.
The JetBrains State of Developer Ecosystem 2025 found that 85% of developers regularly use AI tools, and nearly nine out of ten save at least an hour every week. But here's the uncomfortable truth: 66% of developers don't believe current productivity metrics reflect their true contributions. We're measuring the wrong things.
What This Means Going Forward
I don't think AI is going to replace software engineers. But I do think engineers who effectively use AI are going to outperform those who don't. The gap is already visible—some engineers have genuinely leveled up their output without sacrificing quality.
The key insight is this: AI changes what we optimize for. Writing code was never the hard part of software engineering. The hard parts—understanding requirements, making good design decisions, handling edge cases, debugging complex systems, communicating with stakeholders—remain as human as ever.
As Osmani observed at the AI Engineer Code Summit: "Maybe a terminal and a filesystem is all you need." The models are smart enough now that elaborate scaffolding seems unnecessary. What matters is the human judgment—the ability to recognize when the AI is wrong, to provide clear direction, and to maintain the engineering discipline that keeps code maintainable.
If you're just getting started with AI in your workflow, my advice is simple: start small, stay curious, and never let AI do your thinking for you. Let it augment your thinking instead.
The engineers who thrive in this new era won't be the ones who generate the most AI-assisted code. They'll be the ones who use AI to become better engineers—more thoughtful, more thorough, and more focused on what actually matters: building software that works.