The Best Thing the Agent Did Was Fail
The gaps were always there. We just needed something to fall into them.
A new member joined our team and picked up their first task: add monitoring to our CloudFormation stack. This seemed simple enough.
They got to work, and came back with something that was functional but not up to par. Alarm thresholds were off, naming conventions were wrong, and the implementation didn’t follow our infrastructure standards.
The fix was straightforward. We pointed them to some working examples and a doc outlining our team standards. The second attempt was noticeably better. As you might have guessed, the new team member was an AI agent.
Drinking from the Firehose
You wouldn’t assign a new hire a task on day one and just send them on their way.
Ideally, you’d give them some context first: how to set up their workspace, how the team organizes code, what patterns you follow, and what patterns you avoid. The unwritten rules that exist on every team but that nobody bothered to write down because everyone assumed they were obvious.
Onboarding an AI agent is no different. It’s capable and fast, but completely blind to your team’s norms. Without a new-hire handbook, it will produce something that technically works, but not something your team would actually merge. To get them up to speed, you’ll need to ground them in team guidelines and documentation, so they don’t make it up as they go along.
Undefined Behavior
What information did we have to provide our AI new hire? It primarily fell into two buckets.
General best practices
Even with the best models available, we still had to remind the agent of things that should be table stakes. For example, the single responsibility principle. Left on its own, the agent would happily create one function to do everything. One function to find them, one function to bring them all, and in the darkness bind them. My precious function…
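To make that concrete, here’s a minimal sketch of the pattern, with hypothetical names (none of this is from our actual codebase): one do-everything function, then the same logic split so each function has a single responsibility.

```python
# Hypothetical example: the "one function to rule them all" shape,
# then the same logic split by responsibility.

def process_order_v1(order):
    # Validates, totals, AND formats the receipt -- three jobs in one place.
    if not order.get("items"):
        raise ValueError("empty order")
    total = sum(i["price"] * i["qty"] for i in order["items"])
    return f"Order {order['id']}: ${total:.2f}"


# Refactored: each function does exactly one thing.
def validate_order(order):
    if not order.get("items"):
        raise ValueError("empty order")

def order_total(order):
    return sum(i["price"] * i["qty"] for i in order["items"])

def format_receipt(order, total):
    return f"Order {order['id']}: ${total:.2f}"

def process_order(order):
    validate_order(order)
    return format_receipt(order, order_total(order))
```

The refactored version behaves identically, but each piece can now be tested, reused, and reviewed on its own.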
Language-specific idioms were another one. Our agent would frequently write Python like it was Java. Getter and setter methods on classes, abstract factories where none were needed. Technically functional, but the kind of code that makes the reviewer’s eyes twitch.
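A hypothetical before-and-after of the same pattern (the class names are illustrative, not from our stack): Java-flavored getters and setters next to the idiomatic Python equivalent.

```python
from dataclasses import dataclass

# What the agent tended to write: Java-style accessors around a plain value.
class AlarmConfigJavaStyle:
    def __init__(self, threshold):
        self._threshold = threshold

    def get_threshold(self):
        return self._threshold

    def set_threshold(self, threshold):
        self._threshold = threshold


# Idiomatic Python: plain attributes via a dataclass. Reach for @property
# only when you actually need validation or computed access.
@dataclass
class AlarmConfig:
    threshold: float
```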
But these weren’t really problems with the agent. They were prompting problems, fixed with a few small tweaks.
Team best practices
This is where the bulk of the blind spots were. Beyond general code hygiene, the agent needed to understand how our team operates. This includes our internal build tools, testing frameworks, company-wide guidelines, and team preferences. The kind of context scattered across a dozen wiki pages.
The uncomfortable discovery was that a lot of this information didn’t live anywhere. If it wasn’t in Confluence, it lived in the heads of whoever had been on the team the longest. And the agent can’t access this information easily, not yet anyway.
A Picture Is Worth a Thousand Tokens
Another thing that made a noticeable difference was providing examples or code snippets of common patterns. It didn’t have to be a complete implementation, just enough to push the agent in the right direction.
This includes a rough example of how we structure a package, organize a class, or write a test case. Something that says, “In this codebase, this is roughly how we do things.” Similar to showing a new hire a completed task before having them start one on their own.
As for where these examples live, there are a few options:
- Guidelines can live directly in agent rules, steering docs, or agent.md files.
- Code examples can be chunked into a vector database and retrieved with RAG when needed.
- Wiki docs are still usable too. They can serve both AI agents and humans, and with a well-defined MCP integration they’re easy to retrieve.
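For the first option, a rules file can be short and still do most of the work. A hedged sketch of what an excerpt might look like (the conventions and the `infra/examples/` path are illustrative, not our real standards):

```markdown
# Team conventions (excerpt -- contents illustrative)

## Python
- Prefer plain attributes and dataclasses over getter/setter methods.
- One responsibility per function; split anything that validates AND formats.

## CloudFormation
- Alarm names follow <service>-<env>-<metric>, e.g. api-prod-error-rate.
- Start new alarms from the working example under infra/examples/.
```

The point isn’t completeness. A handful of concrete rules like these would have prevented most of the first-attempt misses described above.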
Root Cause
Turns out we didn’t have a bad agent. We had an onboarding process held together with institutional memory and good intentions.
The gaps the agent exposed weren’t new either. They were just invisible. New hires had probably run into the same walls and filled in the blanks by asking the right person at the right time, or by raising the obvious questions during standup.
When the agent couldn’t do either of those things, it failed visibly.
The process of capturing and consolidating this information for the agent ended up raising the bar for everyone. Tribal knowledge stored in people’s heads finally got written down. Best practices buried in Slack threads were eventually defined.
If the task of capturing this knowledge feels daunting, an agent can help generate these docs for you.
Task Failed Successfully
The best thing the agent did was expose what was missing.
If you can’t explain your team’s standards to an agent, you probably can’t explain them to a new hire either. Writing this information down raises the bar for everyone and saves countless revision cycles during code reviews.
The agent didn’t create the gaps. It just made them painfully obvious to the team.