Tackle the Monkey
Stop wasting your time building pedestals.
A few years ago, I found myself sitting in a conference room at the Google X headquarters. We were there to talk about how Labs and X might collaborate—two parts of Google with different mandates but similar ambitions. Astro Teller, an incredible character who leads Google X, cruised into the room on his famous rollerblades, immediately grabbing our attention.
As he was introducing the philosophy underlying the approach he strives for at X, he shared a note that I keep coming back to as I think about the world we see today. It began with a simple enough statement:
“If I ask you to build a car that gets 50 miles per gallon, you’ll make minor adjustments to the engine. If I ask for 500 miles per gallon, you have to throw out everything you know about cars and start over.”
The point: aiming for 10x is often easier than aiming for 10%.
It sounds backwards. But when you aim for 10%, you’re competing on the same field as everyone else by tweaking existing solutions and achieving marginal results. When you aim for 10x, you’re forced to completely rethink the problem.
That conversation has been rattling around my head lately, because everywhere I look I see teams making the same mistakes with AI coding. You’re aiming for 10% gains, when you need to be shooting for a total revolution in how you work.
Over the last two years, I’ve gone deep on the tools of agentic coding. I regularly push the limits of Antigravity, AI Studio, Claude Code, Cursor, Codex, and Lovable. Despite a lot of experimentation, I don’t yet have a crystal ball or a recipe for how all of these capabilities will come together as a new way of building. But across nearly every discussion I’m following, especially with companies implementing AI systems within their engineering practices, the pattern of incremental thinking is everywhere: teams are wasting their time bolting AI onto their existing workflows and calling it transformation.
Stop fooling yourselves. You’re using a self-driving car to parallel park.
Astro’s framework is the best mental model I’ve found for understanding why most teams are wasting this moment. Let me explain.
Tackle the Monkey
Here’s Astro’s most famous metaphor. Say you’re trying to teach a monkey to recite Shakespeare while standing on a pedestal in Times Square. What do you work on first?
Most teams build the pedestal. It’s easy. Shows progress. Makes your boss happy. But if you can’t teach the monkey to talk, the pedestal is worthless.
The lesson: Attack the hardest, most uncertain problem first. Don’t waste resources on things you already know how to do.
So what’s the “monkey” in agentic coding?
It’s not the tooling—that works. It’s not integration—that’s solvable. The monkey is fundamentally rethinking what software development means when an AI can execute multi-step tasks autonomously with programmatic oversight and deep research backing every decision.
Teams building better CI/CD pipelines? Building pedestals. Integrating code-completion systems into VS Code? Building pedestals. Creating internal RAG systems to consult existing code? Building pedestals. Too many execs are parading their golden pedestals as if they were real innovation.
I want to see more teams completely redesigning their entire development workflow around AI. Many startups are doing this already, often out of necessity. Solopreneurs are vocal champions of this approach. They’re tackling the monkey and you need to be too.
I wish I had a playbook to sell you, but it hasn’t been written yet. The technologies are moving too fast and the best practices are still emerging. But there are good ideas and experiments showing results that you need to think through. Let’s talk about a few I find most interesting:
The search for 10x
I’ve been studying and experimenting with techniques for how the best teams are using these tools. Not the hype—the actual workflows. Here’s what separates incremental adopters from the teams making real leaps.
1. Institutional Memory
Boris Cherny, who created Claude Code, revealed his workflow recently. The key insight: every mistake becomes a rule.
His team maintains a CLAUDE.md file—about 2,500 tokens—that captures every pattern, every guideline, every error Claude shouldn’t repeat. When someone reviews a PR and spots an AI mistake, they don’t just fix the code. They update their institutional memory to ensure it never happens again.
The incremental approach: Write documentation for humans, occasionally paste context into AI chats.
The 10x approach: Your codebase becomes a learning system. The longer the team works together, the smarter the AI gets. Invest in virtuous cycles of learning from the start.
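To make this concrete, here’s a sketch of what such a file might look like. The conventions and incidents below are invented examples for illustration, not Anthropic’s actual rules:

```markdown
# CLAUDE.md (excerpt)

## Conventions
- Use the repo's existing logger; never add print statements.
- All new endpoints require an integration test before merge.

## Mistakes to never repeat
- Agent renamed a public API field during a refactor.
  Rule: never change exported interfaces without an explicit instruction.
- Agent wrote a migration that dropped columns.
  Rule: migrations must be additive unless the prompt says otherwise.
```

Every PR review that catches an AI mistake appends another rule, so the file compounds in value over time.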
2. Verification Loops as Architecture
Here’s the single most important tip Cherny offers: give AI the ability to verify its own work. At Anthropic, Claude tests every change using browser automation. It opens the app, tests the UI, iterates until it works. This improves output quality by 2-3x. I use XCodeBuildMCP to automate this workflow when building iOS apps.
The incremental approach: AI generates code → human reviews → human tests.
The 10x approach: Design for autonomous verification from day one. AI runs tests, observes results, iterates. You review working code, not hopeful code. Human effort is expensive; before you deploy it, make sure it’s spent where it delivers the highest value.
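A minimal sketch of that loop in Python. The `agent_propose` function is a hypothetical stand-in for a real coding agent, stubbed here with canned attempts so the loop can actually run: the agent proposes code, a verifier executes the tests, and failure feedback cycles back until the tests pass.

```python
def agent_propose(task, feedback):
    """Stand-in for a real coding agent; returns candidate source code.

    A real agent would use `feedback` to revise its next attempt; this
    stub just walks through canned attempts in order.
    """
    attempts = [
        "def slug(s): return s.replace(' ', '-')",          # misses lowercasing
        "def slug(s): return s.lower().replace(' ', '-')",  # passes
    ]
    agent_propose.calls = getattr(agent_propose, "calls", 0) + 1
    return attempts[min(agent_propose.calls - 1, len(attempts) - 1)]

def verify(source):
    """Run the 'test suite'; return failure feedback, or None on success."""
    ns = {}
    exec(source, ns)
    try:
        assert ns["slug"]("Hello World") == "hello-world"
        return None
    except AssertionError:
        return "slug('Hello World') should be 'hello-world'"

def build(task, max_iters=5):
    """Loop: propose, verify, feed failures back, until tests pass."""
    feedback = None
    for _ in range(max_iters):
        candidate = agent_propose(task, feedback)
        feedback = verify(candidate)
        if feedback is None:
            return candidate  # only verified code reaches human review
    raise RuntimeError("agent could not satisfy the tests")

print(build("write a slug() helper"))
```

The same shape scales up: swap the stub for a real agent call and the `verify` function for your test runner or browser automation.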
3. Tests as Prompts
Test-driven development is having a moment—but for a different reason than before.
When you’re working with AI agents, tests become the specification language. You write tests that define correct behavior. AI iterates until the tests pass. As Simon Willison puts it: tests give you reliable exit criteria. You’re not relying on the AI’s whims.
The workflow flips:
Plan → Use a thinking model to generate a phased plan
Red → Write a test expressing desired behavior
Green → Let the agent implement minimal code to pass
Refactor → Ask AI to clean up while keeping tests green
Validate → Verify it works end-to-end
The incremental approach: Write code, then write tests, occasionally ask AI for help.
The 10x approach: Tests are prompts. A well-written test is a natural language spec that guides AI toward exactly the behavior you expect.
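Here’s what that looks like in miniature. The function under test, `normalize_email`, is a hypothetical example; the point is that the test reads as a spec an agent can iterate against, and the implementation below is the kind of minimal code it converges on once the test is green:

```python
# The test IS the prompt: each assertion is a requirement in plain terms.
def test_normalize_email_spec():
    # Lowercase the whole address
    assert normalize_email("Ada@Example.COM") == "ada@example.com"
    # Strip surrounding whitespace
    assert normalize_email("  ada@example.com ") == "ada@example.com"
    # Drop +tags from the local part
    assert normalize_email("ada+news@example.com") == "ada@example.com"

# Minimal implementation the agent iterates toward until the test passes:
def normalize_email(raw):
    local, _, domain = raw.strip().lower().partition("@")
    return f"{local.split('+', 1)[0]}@{domain}"

test_normalize_email_spec()  # green: the exit criteria are met
```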
4. Parallel Agent Orchestration
The best AI-first developers run 5-10 instances simultaneously: five locally in the terminal, another five to ten in the browser. Each local session uses its own branch to avoid conflicts. As humans, we’re used to running serial processes, but computers are multi-threaded for a reason, and if you’re not taking advantage of that, you’re wasting your time. When you get it right, it’s altogether different:
“It feels more like Starcraft than traditional coding.”
That’s the shift. From typing syntax to commanding autonomous units.
The incremental approach: One developer, one AI assistant, one task at a time.
The 10x approach: You’re directing an entire team of AI agents, not typing code faster.
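The fan-out itself is simple. Here’s a sketch in Python with a hypothetical `run_agent` call standing in for launching one agent session; in practice each session would get its own git branch or worktree, as described above, so agents never edit the same files concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task, branch):
    """Stand-in for launching one agent session on its own branch."""
    return f"[{branch}] done: {task}"

tasks = [
    "migrate auth module to the new SDK",
    "add retry logic to the payments client",
    "write integration tests for the export endpoint",
]

# One branch per session, one session per independent task.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    futures = [
        pool.submit(run_agent, task, f"agent/{i}")
        for i, task in enumerate(tasks)
    ]
    results = [f.result() for f in futures]

for line in results:
    print(line)
```

Your job shifts to the two things the loop can’t do: slicing work into independent tasks, and reviewing what comes back.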
5. Plan Mode is the New PRD
In the not-so-distant past, product managers wrote PRDs to make sure designers and engineers understood exactly what to do, why to do it, and which business outcomes to target. Today, Plan Mode is beginning to change what it means to draft a PRD, and in many ways it changes the purpose entirely. Rather than spending weeks or months pre-planning engineering investments, ask yourself: what happens when the cost to build falls below the cost to plan? Everything changes.
Today, 10x teams are reinventing the PRD with clever use of AI coding systems and Plan Mode. Again, I’ll point to Boris Cherny:
“If my goal is to write a feature, I will use Plan mode, and go back and forth with Claude until I like its plan. From there, I switch into auto-accept edits mode and Claude can usually 1-shot it. A good plan made for agentic understanding is really important!”
Without explicit planning, AI tends to jump straight to coding. Asking your AI coding system to research and plan first dramatically improves results. There’s an entire article in my head about how to do this really well, but that’ll wait for another day. Let me know if you want me to write about this next!
The incremental approach: Prompt → AI generates code → you review.
The 10x approach: Spend 80% of interaction time on planning, 20% on execution. This is the inverse of how our systems currently work—but optimal for AI collaboration.
The Pedestal Mistakes
These are the incremental approaches teams default to. Comfortable, familiar, fundamentally missing the point:
Better autocomplete — Using AI to finish lines faster, not to rethink how code gets written
Faster Stack Overflow — Asking AI questions you could Google, not letting it autonomously solve problems
Single-session chatting — No persistent memory, no learning, starting from scratch every time
Human-in-every-loop — Requiring approval for every action instead of designing for autonomous verification
One AI, one task — Never parallelizing, never orchestrating, never treating AI as a team
If your workflow looks like this, you’re building pedestals.
The Cultural Shift
Here’s where Astro’s framework gets uncomfortable.
He talks about creating context for moonshot thinking. The culture required. And he’s blunt about what kills it:
“If people are surrounded by business speak, you will ruin it all. If they believe that they have to have a business plan for the weirdness that they are embarked upon, you will kill it—stillbirth guaranteed.”
The teams that will win with agentic coding need psychological safety to fail spectacularly. The first attempts at AI-native development will be messy. Organizations that punish these experiments will lose to those that celebrate the learning.
Astro calls this being “responsibly irresponsible”—radical ambition coupled with disciplined execution.
That’s the balance. Not naive enthusiasm. Not cautious incrementalism. Bold hypotheses, rapid learning, honest assessment.
“The secret? It’s easier to get people to work on making something 10X better than to get them to help make it 10 percent better. Huge problems fire up our hearts as well as our minds.”
The AI Coding revolution isn’t about a new tool. It’s about a paradigm shift in how software gets built. The question isn’t “How do I use Claude Code?”
It’s “What does software development look like when AI agents can autonomously read codebases, plan features, implement changes, run tests, and iterate?”
That’s the monkey.
Are you tackling it? Or building pedestals?
Builder’s note: I’ve been experimenting with these workflows at Google Labs for the past few months. The productivity gains are real, but only when you stop treating AI as autocomplete and start treating it as a collaborator with vastly more freedom to build. The learning curve is steep. I’m convinced the payoff is worth it.



