The Best Thing the Agent Did Was Fail

The gaps were always there. We just needed something to fall into them.


A new member joined our team and picked up their first task: add monitoring to our CloudFormation stack. This seemed simple enough.

They got to work, and came back with something that was functional but not quite up to par. The alarm thresholds were off and the overall implementation didn’t follow our infrastructure standards.

However, the fix was straightforward. We pointed them to some working examples and a doc outlining our team standards. Their second attempt was noticeably better. As you might have guessed, the new team member was an AI agent.

Drinking from the Firehose

You wouldn’t assign your new hire a task on day one and just send them on their way.

Hopefully, you’d hand them some context first: how to set up their workspace, how the team organizes code, which patterns you follow, and which you avoid. The unwritten rules that exist on every team because nobody bothered to write them down.

Onboarding an AI agent is no different. It’s capable and fast, but completely blind to your team’s norms. Without a new-hire handbook, it will produce something that technically works, but not something your team would actually merge. To get it up to speed, you’ll need to ground it in team guidelines and documentation so it doesn’t make things up as it goes.

Undefined Behavior

What information did we have to provide our AI new hire? It primarily fell into two buckets.

General best practices

Even with the best models available, we still had to remind the agent of things that should be table stakes. For example, the single responsibility principle. Left on its own, the agent would happily create one function to do everything. One function to find them, one function to bring them all, and in the darkness bind them. My precious function…
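As a hypothetical sketch of what this looked like (these functions are invented for illustration, not taken from our codebase), the agent would hand back one function that filtered, aggregated, and formatted in a single pass, where we’d want each concern split out:

```python
# Monolith: one function with three reasons to change.
def build_report_monolith(records):
    active = [r for r in records if r["active"]]
    total = sum(r["amount"] for r in active)
    return f"{len(active)} active, total {total}"


# Split version: each function does exactly one thing.
def filter_active(records):
    return [r for r in records if r["active"]]


def total_amount(records):
    return sum(r["amount"] for r in records)


def format_report(count, total):
    return f"{count} active, total {total}"


def build_report(records):
    active = filter_active(records)
    return format_report(len(active), total_amount(active))
```

Both produce the same output, but the split version is the one that survives a code review on our team.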

Language-specific idioms were another. Our agent frequently wrote Python like it was Java, adding getter and setter methods to classes and introducing abstract factories where none were needed. Technically functional, but the kind of code that makes your eyes twitch when you review it.
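A hypothetical example of the difference (the `User` class is invented for illustration):

```python
from dataclasses import dataclass


# Java-flavored Python: explicit getters and setters for every field.
class UserJavaStyle:
    def __init__(self, name):
        self._name = name

    def get_name(self):
        return self._name

    def set_name(self, name):
        self._name = name


# Idiomatic Python: a dataclass with plain attribute access.
@dataclass
class User:
    name: str
```

Same behavior, a third of the ceremony.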

However, these were basic prompting problems, easily fixed with a few small tweaks.

Team best practices

This is where the bulk of the blind spots were. Beyond general code hygiene, the agent needed to understand how our team operates: our internal build tools, testing frameworks, company-wide guidelines, and team preferences. The kind of information typically scattered across a dozen wiki pages.

What we discovered was that a lot of this information wasn’t written down anywhere. If it wasn’t in Confluence, it lived in the heads of those who had been on the team the longest. And the agent can’t access that information easily, not yet anyway.

A Picture Is Worth a Thousand Tokens

Another thing that made a noticeable difference was providing examples or code snippets of common patterns. It didn’t have to be a complete implementation, just enough to push the agent in the right direction.

This includes a rough example of how we structure a package, organize a class, or write a test case. Something that says, “In this codebase, this is roughly how we do things.” Similar to showing a new hire a completed task before having them start one on their own.

As for where these examples live, there are a few options:

  • Guidelines can live directly in agent rules, steering docs, or agent.md files.
  • Code examples can be chunked into a vector database and retrieved with RAG when needed.

Wiki docs are still usable too. They can serve both AI agents and humans, and with a well-defined MCP integration they’re easy to retrieve.
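The retrieval option can be sketched in a few lines. This is a minimal stand-in, not a real pipeline: an actual setup would embed chunks with a model and store them in a vector database, and the chunk text here is invented. Plain keyword overlap substitutes for similarity search:

```python
# Toy retrieval: keyword overlap stands in for embedding similarity.
def score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)


def retrieve(query, chunks, k=1):
    # Return the k chunks most relevant to the query.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]


# Hypothetical team example snippets, pre-chunked.
chunks = [
    "Example: structuring a CloudFormation alarm with our threshold defaults",
    "Example: organizing a Python package the way the team expects",
    "Example: writing a pytest test case against our internal fixtures",
]

context = retrieve("how do we write a pytest test case", chunks)
prompt = "Follow these team examples:\n" + "\n".join(context)
```

The point is the shape of the flow: retrieve only the examples relevant to the task and prepend them to the agent’s prompt, rather than loading everything every time.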

Root Cause

Turns out we had an onboarding process held together with institutional memory and good intentions.

The gaps the agent exposed weren’t new. They were just invisible. Previous new hires had probably stumbled over the same things but filled in the gaps by asking the right person at the right time, or by raising the obvious questions during standup.

When the agent couldn’t do either of those things, it failed visibly.

The process of capturing and consolidating this information for the agent ended up raising the bar for everyone. Tribal knowledge stored in people’s heads finally got written down, and best practices buried in Slack threads were finally codified.

If the task of capturing this knowledge feels daunting, keep in mind an agent can help generate these docs for you as well.

Task Failed Successfully

The best thing the agent did was expose what was missing.

If you can’t explain your team’s standards to an AI agent, you probably can’t explain them to a new hire either. Writing this information down raises the bar for everyone and saves countless revision cycles during code reviews.

The agent didn’t create the gaps. It just made them painfully obvious to the team.


Update [March 16, 2026]

I may have leaned too hard into the new-hire analogy and made things seem more static than agent context is in practice.

Obviously, what we provide an agent isn’t a fixed stack of docs. There are different mechanisms for loading context on demand. For example, tooling like Claude agent skills or Kiro powers lets agents dynamically retrieve information and tools based on the task at hand. This saves you from loading every resource upfront and keeps your token usage from ballooning on every single request.

I covered something similar in a previous post, where I argued that more context doesn’t mean better context. Relevance still matters, and dumping every wiki page and tool definition into an agent’s context window doesn’t typically produce better results.

That said, don’t over-engineer this. You’ll consistently get better results simply by keeping your context docs brief and your tool lists trimmed.