Field Notes

Agent Efficiency Research

Reconciled findings from studies on AGENTS.md file impact.

The question of whether agent context files (like CLAUDE.md or AGENTS.md) actually improve AI coding agent performance has been studied empirically. Two prominent studies reached seemingly contradictory conclusions. This page reconciles those findings into actionable guidance.

The Two Studies

Study 1: AGENTS.md Improves Efficiency

Source: arXiv:2601.20404

This study measured the impact of including an AGENTS.md file on AI coding agent performance across a range of tasks. The findings were strongly positive:

  • 28.64% faster runtime — agents completed tasks significantly more quickly when given structured project context.
  • 16.58% fewer tokens consumed — agents required less back-and-forth reasoning, fewer retries, and fewer exploratory reads of the codebase.

The study's conclusion was straightforward: providing agents with a context file that describes the project, its conventions, and its structure leads to measurably better outcomes.

Community Finding: Authorship Determines Value

Source: Popularised by Theo (t3.gg, YouTube / X) — no peer-reviewed paper exists; treat as community-reported finding, not verified research.

This finding introduced a critical variable that the first study did not isolate: who wrote the context file. The results diverged sharply based on authorship:

  • LLM-generated context files: Performance decreased by ~3%, while token costs increased by ~20%. The agent spent more tokens processing information that was either redundant (things it could discover on its own) or subtly misleading (plausible-sounding but imprecise descriptions of the codebase).
  • Human-written context files: Performance improved by ~4%. The agent benefited from concise, accurate pointers to information it could not easily discover through code exploration alone.

The Contradiction

At first glance, these findings appear to disagree. The verified study says context files help. The community finding says they can hurt.

The Reconciliation

The findings are not contradictory — they are measuring different things. The verified study demonstrated that the concept of a context file is sound. The community finding demonstrated that the implementation matters enormously.

The reconciled finding is:

Agent context files improve performance when they are human-written, concise, and focused on non-discoverable information. They degrade performance when they are LLM-generated, verbose, or redundant with what the agent can already determine from the codebase.

This is not a subtle distinction. The difference between a human-written file (+4% performance) and an LLM-generated file (-3% performance, +20% cost) is a 7 percentage point swing in performance and a significant increase in cost — from the same category of intervention.

Key Takeaways

Keep the file, but keep it tight

The data is clear that a good context file is better than no file at all. Do not skip it. But do not let it grow unchecked.

Keep it concise

Every line in the file costs tokens on every agent invocation. Lines that provide genuine, non-discoverable context earn their cost back through faster, more accurate task completion. Lines that restate the obvious consume tokens without providing value. Anthropic's guidance is qualitative: keep it focused. As an editorial recommendation, aim for the shortest file that covers essentials and accumulated corrections — typically well under 100 lines.
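To make the per-line cost concrete, here is a back-of-the-envelope model. All numbers (tokens per line, invocations per day) are hypothetical assumptions for illustration, not measurements:

```python
# Back-of-the-envelope cost model for a context file.
# All constants are hypothetical assumptions, not measurements.

TOKENS_PER_LINE = 12        # rough average for prose-style instructions
INVOCATIONS_PER_DAY = 40    # agent runs per developer per day

def daily_context_tokens(lines: int) -> int:
    """Tokens spent re-sending the context file on every invocation."""
    return lines * TOKENS_PER_LINE * INVOCATIONS_PER_DAY

# A tight 60-line file vs. an unchecked 400-line file:
print(daily_context_tokens(60))   # 28800 tokens/day
print(daily_context_tokens(400))  # 192000 tokens/day
```

The absolute numbers do not matter; the point is that the file's cost scales linearly with its length and is paid on every single invocation, which is why each line must earn its keep.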

Focus on non-discoverable information

The agent can read your code. It can read your package.json, your tsconfig.json, your directory structure, and your test files. What it cannot discover on its own:

  • Why a particular architectural decision was made
  • Which conventions are intentional versus accidental
  • What common mistakes to avoid that are specific to your codebase
  • How modules relate to each other in ways not obvious from imports

This is the information that belongs in the file.
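A hypothetical AGENTS.md fragment that sticks to this kind of non-discoverable information might look like the following (all paths, module names, and conventions are invented for illustration):

```markdown
# AGENTS.md (illustrative example)

## Architecture decisions
- `services/billing` talks to Stripe only through `lib/payments`, never
  directly: we need a single choke point for audit logging.

## Intentional conventions
- Dates are ISO 8601 strings everywhere, including log output.

## Known pitfalls
- `utils/legacy.ts` is deprecated; new code must import from `utils/core.ts`.
```

Note what is absent: no directory listing, no dependency list, no restatement of anything the agent can read from the code itself.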

Use compounding engineering

The best context files are not written in one sitting. They accumulate corrections over time. When the agent makes a mistake — uses the wrong date format, puts a file in the wrong directory, uses a deprecated API — fix the mistake and add a one-line correction to the context file. Over weeks and months, these corrections compound into a highly targeted, highly effective set of instructions.
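In practice, each correction is a single appended line with just enough context to prevent the same mistake. A hypothetical accumulated section (mistakes and paths invented for illustration):

```markdown
## Corrections (accumulated)
- Use `YYYY-MM-DD` in changelog entries, not `MM/DD/YYYY`.
- New React hooks go in `src/hooks/`, not next to the component.
- `fetchUser()` is deprecated; call `getUser()` from `api/client`.
```

Each line exists because an agent actually made that mistake once, which is exactly what makes the section high-signal.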

Never use LLM-generated content

Do not ask an AI to write or expand your CLAUDE.md. The community finding above suggests this degrades performance: LLMs tend to produce plausible-sounding but generic descriptions that add token cost without adding signal. The value of a context file comes from the specific, hard-won knowledge of humans who work in the codebase every day.

Implications for Teams

These findings have practical implications for how teams manage their agent context files:

  • Treat the file as code. It should be version-controlled, reviewed in pull requests, and subject to the same quality standards as production code.
  • Assign ownership. Someone on the team should be responsible for reviewing and pruning the file quarterly.
  • Measure impact. If you are tracking agent performance metrics (task completion rate, token usage, time to completion), measure before and after changes to the context file to validate that additions are actually helping.
  • Resist the urge to be comprehensive. The instinct to "document everything" works against you here. Completeness is not the goal — precision is.
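The "measure impact" step above can be sketched as a simple before/after comparison. The metric names and sample values below are hypothetical, chosen only to mirror the magnitudes reported by the verified study:

```python
# Compare agent metrics before and after a context-file change.
# Metric names and sample values are hypothetical, not real measurements.

def pct_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100

baseline  = {"runtime_s": 412.0, "tokens": 181_000, "completion_rate": 0.78}
with_file = {"runtime_s": 294.0, "tokens": 151_000, "completion_rate": 0.84}

for metric in baseline:
    delta = pct_change(baseline[metric], with_file[metric])
    print(f"{metric}: {delta:+.1f}%")
# runtime_s: -28.6%
# tokens: -16.6%
# completion_rate: +7.7%
```

Running this comparison before and after each change to the file turns "is this line worth it?" from a matter of opinion into a measurable question.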
