Goodhart’s Law in AI Adoption Metrics
The mechanism
Goodhart’s Law states that when a measure becomes a target, it ceases to be a good measure. In AI adoption programs, this plays out through a specific escalation pattern:
- Mandate phase — Leadership announces AI transformation. Every team must adopt AI tools and report results.
- Competition phase — One team reports 40% productivity gains. The next reports 60%. A third claims 80% automation. Nobody checks methodology because nobody has a baseline.
- Gamification phase — Leaderboards appear tracking prompts per week, AI-generated code percentage, agent skills shipped. Individual contributors are ranked against peers.
- Inversion phase — The entire organization optimizes to make the adoption metric look good rather than to make the company better. The metric was the instrument; now it’s the objective.
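The spiral is easy to see in a toy simulation. A minimal sketch in Python, with entirely made-up numbers: each team anchors on the previous report rather than on any baseline, so reported gains climb while the (assumed) true gain stays flat.

```python
# Illustrative simulation of the escalation spiral. All numbers are
# hypothetical; the point is the anchoring dynamic, not the values.
import random

random.seed(0)

true_gain = 0.05       # assume a modest real productivity gain of ~5%
reported_so_far = 0.0  # the bar set by the previous team's report

for team in ["A", "B", "C", "D"]:
    # Nobody has a baseline, so the only anchor is the last report:
    # each team reports somewhat more than whoever reported before it.
    reported = max(true_gain, reported_so_far) + random.uniform(0.10, 0.25)
    reported_so_far = reported
    print(f"Team {team}: true ~{true_gain:.0%}, reported {reported:.0%}")

# With this seed, reports climb from ~28% to ~79% over four teams while
# the true gain never moves: the metric has decoupled from reality.
```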
Why this is worse than typical Goodhart failure
Standard Goodhart scenarios involve a known-good metric that gets gamed. AI adoption metrics were never good measures to begin with. “Percentage of code AI-generated” conflates quality with quantity. “Number of agent skills built” rewards complexity. “Time saved” is self-reported with no control group. The metric was born corrupted and then made a target.
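To make the quantity/quality conflation concrete, here is a minimal sketch with hypothetical line counts. The function computes the metric the obvious way, as a share of lines, which is exactly why verbose, lightly reviewed AI output scores as deeper adoption:

```python
# Toy illustration of why "% of code AI-generated" conflates quantity
# with quality. The line counts below are hypothetical.

def ai_share_by_lines(ai_lines: int, human_lines: int) -> float:
    """The metric as typically computed: share of lines attributed to AI."""
    return ai_lines / (ai_lines + human_lines)

# Scenario 1: concise, human-reviewed code with a modest AI assist.
print(f"Lean codebase:    {ai_share_by_lines(ai_lines=200, human_lines=800):.0%} AI")

# Scenario 2: verbose, lightly reviewed AI output. The metric rises even
# if defect rates and review load rise with it, because more lines is
# counted as more adoption, never as more liability.
print(f"Bloated codebase: {ai_share_by_lines(ai_lines=3000, human_lines=800):.0%} AI")
```

The same structural flaw applies to "number of agent skills built" (more artifacts always scores higher) and to self-reported "time saved" (no counterfactual to score against at all).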
The grain-reporting parallel
Hanchung Lee draws an explicit analogy to the Great Leap Forward’s grain reporting, where provinces competed to report impossible yields. Officials staged photographs of rice planted so densely that children could stand on it. The central government, pleased by the numbers, increased requisitions based on the reported yields, and the difference between the real harvest and the fantasy came out of the farmers’ food. They starved.
The corporate equivalent: leadership increases AI investment based on reported gains, reallocates headcount based on claimed automation, and sets next quarter’s targets based on this quarter’s fiction.
Key Takeaways
- AI adoption metrics are uniquely vulnerable to Goodhart failure because no established baselines exist for most knowledge work.
- The competitive dynamic between teams creates an escalation spiral where each report must outdo the last.
- Self-reported productivity gains without methodology review are not data — they are political signaling.
- The corrective is measuring outcomes (revenue, error rates, customer satisfaction) rather than inputs (AI usage, prompts written, skills shipped).
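As a sketch of what that corrective could look like in practice, assume a staged rollout where some teams adopt the tools and comparable teams do not. A simple difference-in-differences on an outcome metric (all numbers below are hypothetical) answers a question no leaderboard number can:

```python
# Minimal sketch of outcome-based measurement: compare outcome changes
# against a control group instead of collecting self-reported input
# metrics. The values below are hypothetical; in practice they would
# come from production data (error rates, cycle time, revenue, etc.).

def diff_in_diff(treated_before: float, treated_after: float,
                 control_before: float, control_after: float) -> float:
    """Change in the treated group minus change in the control group.

    The control group absorbs whatever moved for everyone (seasonality,
    hiring, unrelated process changes), isolating the effect of the
    rollout itself."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical defect rates per 1k changes, before and after AI rollout.
effect = diff_in_diff(treated_before=12.0, treated_after=9.0,
                      control_before=12.5, control_after=11.5)
print(f"Estimated AI effect on defect rate: {effect:+.1f} per 1k changes")
# -> -2.0: defects fell two points more in teams using AI than in teams
#    that did not. That is a claim self-reported percentages cannot make.
```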
Related Notes
- AI Productivity - The Micro-Macro Disconnect — the micro-level gains are real in controlled studies but unmeasurable in the wild, which is exactly the gap that fabricated metrics fill
- AI Transformation Requires Strong Form Org Redesign — strong-form transformation requires honest assessment of what AI actually changes, which Goodhart dynamics prevent
- The AI Great Leap Forward — the source clipping that develops the full analogy