Foundation
AI is an excavator. You are the operator.
An excavator is an extraordinary machine — it can move in hours what would take a crew with shovels weeks to accomplish. But here's the thing: someone who has never worked a construction site doesn't become a builder by sitting in the cab. They'll dig in the wrong place, hit a gas line, or undermine the foundation they're supposed to be building on.
A skilled operator — someone who understands soil types, structural engineering, site plans, and safety protocols — can do extraordinary things with that machine. They know where to dig, how deep to go, when to stop, and what the consequences are of every decision. The machine amplifies everything they know.
AI works exactly the same way. It amplifies skilled operators and is dangerous for unskilled ones. Your domain knowledge, your understanding of what the customer needs, your architectural judgment, your sense of what "done" means — that's what makes AI powerful. Without it, you get confident, fast, wrong output.
Every principle below is a variation on this: how to be a better operator. (For the emotional side of this shift — especially if you're a senior developer struggling with letting go — see It Wasn't Wrong. For the practical dos and don'ts, see Shifting Gears.)
The Four Principles
This is the single most important thing to understand about working with AI. The context window is the only thing the model has. It doesn't remember yesterday. It doesn't know your project history. It doesn't carry forward the brilliant solution you worked on last week. Every time you start a conversation, the model is a world-class expert who just woke up with total amnesia in an unfamiliar room.
Your entire job as the operator is to fill that room with the right information — fast, efficiently, and completely.
What this means in practice
Vague prompts starve the model:
- "Update the component"
- "Apply a good security pattern"
Compare those with a prompt that names the exact component, the files involved, the security pattern you want, and the constraints that apply. The difference isn't verbosity — it's completeness. AI doesn't get confused by complexity. It gets confused by ambiguity. Have 300 concerns about a feature? List them all. Fifteen edge cases? Enumerate them. AI is exceptional at juggling complexity when you make it explicit.
The self-fulfilling prophecy
If you're skeptical about AI and think "so much can go wrong", you'll limit the task to the bare minimum to stay safe. Three-line prompt, minimal context, crossed arms. The AI produces mediocre output. "See? I told you it wasn't that good."
But you starved it. You gave the excavator operator a site plan drawn on a napkin and complained when the foundation was crooked.
Anthropic's own randomised controlled trial confirmed this: passive delegation — minimal input, maximum output — scored below 40% on comprehension. Strategic use with rich context scored 65% and above. The way you use AI matters more than which model you use.
Don't be afraid to "look stupid"
Dump your messy thoughts. All your rollout concerns, domain awkwardness, that flaky e2e test, your speculations about whether approach A or B is better. Share what's fixed, what's fuzzy, and what you're worried about. Let the AI sense that some things aren't hard instructions but open questions it should explore and challenge. Invite pushback: "challenge my assumptions before coding."
If you've ever watched AI confidently edit thirty files and then realised it missed two of your crucial cross-cutting concerns — your permission checks, your analytics pattern, your naming conventions — you know the pain. AI went ahead, the change "worked", but it broke a downstream job and ignored rules that were obvious to you.
The fix is almost always the same: you didn't plan.
How to plan
Step 1: Explain before coding. Describe scope, assumptions, risks, testing strategy, and your rough idea of an approach. Don't give AI the keyboard yet — keep it in a planning or thinking mode. The AI will uncover ideas or concerns you didn't think about yourself. That's the point.
Step 2: Don't settle for the first plan. Insist on iterating until you feel confident. The AI's first proposal is a draft, not a contract. Push back, ask for alternatives, probe the trade-offs.
Step 3: Re-plan when requirements change. Don't let AI continue building because it's in the middle of something. If you had an epiphany — small or big — break in, stop the agent, and tell it about your observation. Go back to planning mode. The excavator operator doesn't keep digging when the architect changes the blueprint.
Step 4: Use ask-mode for complex domains. Force the model to batch questions before acting. This results in fewer wrong turns and more accurate solutions.
Why planning is the new code review
In the old world, we reviewed code after it was written — pull request comments, style nitpicks, architectural debates over existing diffs. In the AI world, the code itself is generally excellent. AI produces better naming, better comments, better structure than most humans' first pass. Where it fails is at fundamental architectural decisions — the kind that should have been caught before a single line was written.
Shift your quality assurance upstream. Review the plan, not the code. If the plan was good and the model is capable, the code is virtually guaranteed to be solid.
What are skills?
A skill is a reusable, structured package of context and instructions that an AI agent can auto-select and apply. Think of it as a slash command on steroids: it has a metadata layer (so the LLM knows when to use it), a detailed instruction body, and linked files for deeper context — scripts, templates, examples.
The key difference from prompt snippets: skills are auto-selected by the LLM based on the task. You don't have to remember to paste them. The model recognises "this task matches the deployment skill" and loads the right context automatically.
Why skills matter
Remember Principle 1: the context window starts blank every time. Skills are the mechanism that solves this. Instead of manually reconstructing context every session, skills inject your accumulated knowledge — your architectural principles, your review standards, your deployment procedures — automatically and efficiently.
The people who thrive with AI are the ones who constantly grow their library of skills. Every time you find yourself giving the same instruction twice, that's a signal: encode it as a skill. Unlike humans, who forget corrections and drift back to old habits, skills are permanent.
The anatomy of a skill
- YAML frontmatter — metadata for auto-selection: name, description, trigger patterns.
- SKILL.md body — the detailed instructions, conventions, and decision logic.
- Linked files — scripts, templates, examples, configuration.
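As a sketch of how auto-selection can work, here is a minimal Python parser for a hypothetical skill file. The field names (`name`, `description`, `triggers`) and the substring matching are illustrative assumptions, not any particular tool's actual format:

```python
import re

# Hypothetical SKILL.md content: YAML-style frontmatter plus instruction body.
SKILL_MD = """\
---
name: deployment
description: How we deploy, roll back, and verify releases
triggers: deploy, rollback, release
---
Always run the smoke-test suite before promoting a build...
"""

def parse_skill(text):
    """Split a skill file into frontmatter metadata and instruction body."""
    match = re.match(r"---\n(.*?)\n---\n(.*)", text, re.DOTALL)
    meta_block, body = match.groups()
    meta = {}
    for line in meta_block.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body

def matches_task(meta, task):
    """Crude stand-in for the LLM's auto-selection step."""
    triggers = [t.strip() for t in meta.get("triggers", "").split(",")]
    return any(t and t in task.lower() for t in triggers)

meta, body = parse_skill(SKILL_MD)
print(matches_task(meta, "Please rollback the failed release"))  # True
```

In real tooling the selection step is done by the model reading the metadata, not by substring matching; the sketch only shows the shape of the mechanism: cheap metadata up front, detailed instructions loaded on demand.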
What should become a skill?
- Review procedures — how code should be reviewed, what patterns to enforce
- Architecture guidelines — your conventions and non-negotiables
- Deployment and operations — how to deploy, rollback, check
- Domain-specific knowledge — business logic, compliance, edge cases
- Testing philosophy — what to test, how, what coverage means
- Documentation standards — how docs should look and when to update
This principle didn't exist in the original cookbook. It should have. Six months later, I can tell you from personal experience: AI burnout is real.
Why AI work is addictive
AI-assisted development is genuinely addictive, especially for senior developers. If you have the experience, know what you're doing, have an explorative mindset and curiosity — AI becomes an extraordinary amplifier. The impact is intoxicating. You ship things in hours that would have taken weeks.
And that's exactly where the danger lies. When impact becomes addictive, work becomes borderless. You can always pull out your phone and continue prompting on that task. Benj Edwards at Ars Technica described burning himself out across fifty simultaneous projects — the machine is tireless, but the human operator isn't.
The barcoding pattern
Think of your day as a barcode — strict black and white zones:
- Black zones: Deep creative work. Fully engaged, prompting, reviewing, building. The excavator is running at full power.
- White zones: Genuine rest. Brain off. No "just checking" the AI's progress on your phone.
- Grey zones: The danger. Sitting in front of the screen waiting for AI to finish. Neither resting nor productive. This is where burnout lives.
Eliminate the grey. Either you're operating the machine or you're away from it.
The new feature creep
You can build everything, so you end up with a sprawl of features that makes you ask, "Do we actually need this?" The risk is polluting the product with functionality that doesn't solve real problems. The antidote: know your customer, know what you're trying to achieve, and invest in upfront planning. When you can build at the speed of thought, discipline about what to build becomes the scarce skill.
Team structure and borderless work
- Set explicit expectations about work hours and availability
- Recognise that AI work has a different energy profile — intense bursts followed by genuine rest, not steady eight-hour days
- Watch for signs of borderless work: prompting at midnight, building on weekends because "it only takes a minute"
- Create team rituals that enforce white zones — stand-downs, no-build days
The excavator has an ignition key. Learn to turn it off.
Tactics That Saved Me Hours
This is the quantitative version of "context is king." Look at the balance between your input tokens and output tokens. You want a high input-to-output ratio — that's the symptom of thoughtful, well-contextualised prompting.
Front-loading means: using skills, providing meeting transcripts, dumping your concerns and observations into the context window, explaining edge cases, sharing what you've tried and what you're worried about.
What you don't want: a 4,000-line code output generated from a three-line prompt. That's the wrong workflow if you want precision and impact.
Rule of thumb: if your input is less than 20% of the total conversation tokens, you're probably under-specifying. The AI is guessing where you should be telling.
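The rule of thumb is easy to turn into a quick self-check. This is a toy Python sketch; the 20% threshold comes from the rule above, and the token counts in the examples are illustrative:

```python
def input_ratio(input_tokens, output_tokens):
    """Fraction of the conversation that came from you, not the model."""
    total = input_tokens + output_tokens
    return input_tokens / total if total else 0.0

def under_specified(input_tokens, output_tokens, threshold=0.20):
    """Below ~20% input, you're probably guessing where you should be telling."""
    return input_ratio(input_tokens, output_tokens) < threshold

# A three-line prompt that produced thousands of lines of code is a red flag:
print(under_specified(input_tokens=120, output_tokens=9000))   # True
# A front-loaded session with transcripts, skills, and concerns:
print(under_specified(input_tokens=6000, output_tokens=9000))  # False
```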
Just had a meeting discussing a new feature, a tricky bug, or a complex concept? Feed those straight into your AI session.
- Meeting transcripts from Gemini, Otter.ai, or any note-taking tool
- Ticket descriptions with full comment threads
- Slack conversations (copy the whole thread — don't cherry-pick)
- Design feedback from Figma comments or reviews
- Product specs with all the "what if" discussions
Your teammates already did the hard work of discussing, debating, and surfacing concerns. Give AI that rich context — it's like inviting AI to the meeting retroactively.
AI agents can open a browser, navigate your site, click buttons, fill forms, and observe what happens — while monitoring backend logs and frontend console errors simultaneously.
Stop describing visual issues in words. Tell AI: "Go and see for yourself at http://localhost:3000/dashboard." For production bugs, let it navigate production and compare with your local version. It will spot CSS differences, JS errors, API response discrepancies — things you'd spend fifteen minutes screenshotting and describing.
Drag in a screenshot. AI will analyse the image, spot the issues, and fix them. It's not just faster — it's more accurate because you're not translating visual details into words. AI can also work with visual assets directly: transform PNG to SVG, change colours, resize, optimise.
If you want real speed, AI needs to execute commands without constantly asking permission. In containerised dev environments — and you should be using them — everything runs sandboxed. AI can't delete your hard drive or access personal files. Allow-list commands you know are safe: package installs, migrations, test runners, linters, build commands.
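A minimal sketch of such an allow-list, assuming you gate on the executable name. The command names here are examples, not a recommendation; anything off the list still requires explicit confirmation:

```python
import shlex

# Hypothetical allow-list: commands known to be safe inside the sandbox.
ALLOWED_COMMANDS = {"npm", "pytest", "eslint", "make", "bundle"}

def may_auto_run(command_line):
    """Return True if the command's executable is on the allow-list."""
    parts = shlex.split(command_line)
    return bool(parts) and parts[0] in ALLOWED_COMMANDS

print(may_auto_run("pytest tests/ -x"))  # True
print(may_auto_run("rm -rf /"))          # False
```

Real agent tooling typically has richer policies (argument patterns, per-directory rules), but the principle is the same: pre-approve the boring, frequent commands so the agent never stalls on them.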
Specialised review tools like CodeRabbit add value for a simple reason: they're essentially deeply refined skills maintained full-time by someone whose only job is optimising the LLM's behaviour for code review. Under the hood it's using one of the known LLMs — but with a tailored instruction set. That focused optimisation is worth paying for.
CodeRabbit consistently catches security and race-condition issues that the building agent misses because it's deep in solution mode. In high-pace AI-driven development, having an independent reviewer with a fresh context is invaluable.
Parallelising three independent, small tasks across background agents is powerful. But avoid concurrent agents touching the same concern — they step on each other's diffs, produce merge conflicts, and hide responsibility. When two agents edit the same files simultaneously, one gets ahead, specs start breaking for the other, and you end up with a cascading mess.
The fix: give each agent its own workspace branch or isolated sandbox. Never let two excavators dig the same trench from opposite ends without coordinating.
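One way to give each agent its own workspace is `git worktree`, which checks a separate branch out into its own directory while sharing the repository's history. A sketch in Python; the branch and directory naming scheme is a made-up convention:

```python
import subprocess

def create_agent_workspace(repo_path, agent_name, base_branch="main"):
    """Give an agent its own branch and working tree so diffs never collide.

    `git worktree add -b <branch> <path> <base>` creates a new branch from
    the base and checks it out into a separate directory.
    """
    workspace = f"{repo_path}-{agent_name}"          # hypothetical convention
    branch = f"agent/{agent_name}"                   # hypothetical convention
    subprocess.run(
        ["git", "-C", repo_path, "worktree", "add", "-b", branch,
         workspace, base_branch],
        check=True, capture_output=True,
    )
    return workspace
```

Each agent then commits to its own branch, and you merge the results deliberately instead of untangling interleaved diffs.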
The context window has a limit. When you hit it, the model summarises the conversation — which means losing detail. This is the primary cause of agent drift: the AI is kind of doing what you want, but not exactly.
When to start fresh
- Your thread has drifted far from its original goal
- You're working on a different task than the one you started with
- The context meter is approaching capacity
- You notice the AI getting vague, repetitive, or losing specifics
The structural fix for drift
Delegate to sub-agents. Instead of running everything through a single long conversation, break work into focused tasks with isolated contexts. Each sub-agent gets a clear brief and returns a result.
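In pseudocode-level Python, the idea looks like this. `call_model` is a stand-in for whatever API your tooling uses; the point is that each sub-agent's context starts blank and contains only its brief:

```python
def run_sub_agent(brief, call_model):
    """Run one focused task in its own context and return only the result."""
    context = [{"role": "user", "content": brief}]  # isolated: no parent history
    return call_model(context)

def delegate(briefs, call_model):
    """Fan independent briefs out to sub-agents; collect their results."""
    return {brief: run_sub_agent(brief, call_model) for brief in briefs}

# A stub model so the sketch is runnable without an API:
fake_model = lambda ctx: f"done: {ctx[0]['content']}"
results = delegate(["audit auth flows", "update changelog"], fake_model)
print(results["audit auth flows"])  # done: audit auth flows
```

The parent conversation only ever sees the returned results, so it stays short, specific, and far from the summarisation cliff.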
"You say you're done — are you actually done? As in REALLY done?"
- All TODO/FIXME removed or ticketed
- All specs updated and passing
- Backwards-compatibility risks addressed
- Rollback plan noted
- Documentation updated
- Linters, types, formatters clean
- Edge cases from the planning phase actually handled
Never accept "done" without an audit. Make the AI prove it. Better yet, deploy automated validators that make it impossible to claim "done" when linters are failing or translations are missing.
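A sketch of such a validator in Python: each named check runs a command, and "done" is only accepted when every check passes. The specific commands are placeholders for your own toolchain, not a prescribed setup:

```python
import subprocess

# Hypothetical audit: each check must pass before a "done" claim is accepted.
CHECKS = {
    "no stray TODOs": ["grep", "-rn", "TODO", "src/"],  # grep exit 1 = none found
    "linter clean":   ["ruff", "check", "."],
    "tests passing":  ["pytest", "-q"],
}

def audit_done(checks=CHECKS):
    """Return the list of failed checks; an empty list means actually done."""
    failed = []
    for name, cmd in checks.items():
        result = subprocess.run(cmd, capture_output=True)
        # grep inverts the convention: exit code 1 means "no matches", which is good.
        ok = result.returncode == 1 if cmd[0] == "grep" else result.returncode == 0
        if not ok:
            failed.append(name)
    return failed
```

Wire this into CI or a pre-merge hook and the agent simply cannot claim "done" while a check fails.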
Supporting Practices
What to document
- Architecture — your conventions, patterns, and non-negotiables
- Domain — business logic, customer rules, compliance requirements
- Security — auth patterns, data handling, sensitive areas
- Operations — deployment procedures, monitoring, incident response
- Testing philosophy — what to test, how, what coverage means
Where to put it
Create a /docs directory with logical subfolders. Add agent configuration files at the repo root. Cross-reference so AI can navigate easily.
How to build it
Don't write documentation manually. Prompt AI to create it. Have AI analyse your project, ask follow-up questions, and generate docs from the actual codebase. Then review and refine. A few hours invested supercharges productivity going forward.
Documentation as skills
Your documentation isn't just for humans. It's input for skills. The architecture docs become the foundation of your "architecture review" skill. The testing philosophy becomes your "test generation" skill. Documentation and skills are two sides of the same coin — they both exist to efficiently inject context into blank conversations.
Principles that hold
- Use the best available model for complex, cross-file work. Planning, architecture, security — these deserve the most capable model.
- Drop down for focused, localised tasks. Quick refactors, formatting, simple CRUD — faster models are fine.
- Consistency matters more than peak capability. Pick a model and stick with it for a session.
- Always pay for the best. AI tooling costs are a rounding error compared to the productivity boost.
What's changed since October 2025
- Context windows have grown significantly. 200k+ tokens is standard; some models offer 1M+.
- Skills and structured context injection are now first-class features across most tools.
- Sub-agent delegation is now practical, not experimental.
- Quality variance between top providers has narrowed significantly.
Here be dragons: models are out of date by nature
Models are trained at a specific point in time — often more than a year ago. Confident AI ≠ correct AI. When working with libraries that have evolved since the model's training cutoff, explicitly pin versions and ask AI to validate against current docs.
An extremely powerful technique: ask AI to browse the installed library's actual source code. It can read method signatures, understand the current DSL, and suggest correct implementations from actual code on disk — rather than relying on stale training data.
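In Python, for example, the standard `inspect` module can pull the real signature and source location of any installed API, ready to paste straight into the prompt as ground truth:

```python
import importlib
import inspect

def describe_installed(module_name, attr):
    """Report the actual signature and source file of an installed API,
    so prompts reference the code on disk, not stale training data."""
    module = importlib.import_module(module_name)
    obj = getattr(module, attr)
    return {
        "signature": str(inspect.signature(obj)),
        "defined_in": inspect.getsourcefile(obj),
    }

# Example with the standard library's json.dumps:
info = describe_installed("json", "dumps")
print(info["signature"])
print(info["defined_in"])
```

The same idea scales up: point the agent at the installed package directory and let it read the current method signatures and DSL for itself.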
Before engaging AI
- Problem statement, constraints, and success criteria written
- Version numbers known (runtime, frameworks, key libraries)
- Links to code, docs, and issues prepared
- Relevant skills loaded or accessible
- Meeting notes or conversation context gathered
While building with AI
- Planning mode used before coding
- Context front-loaded (high input-to-output ratio)
- Versions pinned; APIs validated against current docs or source
- Diffs kept manageable; tests requested alongside code
- Re-plan triggered when requirements change
- Sub-agents used for isolated, parallel tasks
Before merging
- "Done means done" audit passes
- Code review (AI review tool + human review of plan)
- Rollback/migration plan captured
- Documentation updated
- Linters, types, formatters clean
Protecting yourself
- Day structured as barcode (black/white, no grey)
- Clear boundaries on work hours
- Feature backlog reviewed for necessity, not just feasibility
- Skills library maintained and growing
Microsoft's Mustafa Suleyman predicts that most white-collar tasks involving "sitting at a computer" will be fully automated within 18 months. Whether his timeline is exactly right is debatable, but the direction isn't. The question isn't whether AI will transform how you work — it's whether you'll be the operator or the person wondering what happened.
A year ago, we might have been in a "1993 internet moment" — infrastructure emerging, killer apps not yet materialised. But in just one year, we've compressed decades of progress. The technology is there. Adoption isn't. The teams who get on board now, build their skills libraries, develop their operator instincts, and learn to manage the machine sustainably — those are the teams that will win.
Build for what models will do in six months, not just what they do today. The constraint you're designing around might disappear next quarter. The "magic moment" when capability catches up to your product design — you want to be ready for that, not scrambling to catch up.
The excavator gets more powerful every month. The question is whether you're learning to operate it.