Highlights

  • Amazon shut down KiroRank after employees ran unnecessary AI tasks to inflate usage scores, a practice now called "tokenmaxxing."
  • Combined 2026 capital expenditure from Amazon (NASDAQ: AMZN), Microsoft (NASDAQ: MSFT), Alphabet (NASDAQ: GOOGL), and Meta (NASDAQ: META) is tracking between $650 billion and $700 billion.
  • Amazon has replaced raw token-count metrics with a new measure called "normalised deployments," focused on productive AI-driven output.

The Leaderboard That Backfired

Amazon has shut down KiroRank, an internal ranking system that scored employees on their use of AI tools on the company's Kiro developer platform, after workers tried to boost their scores with unnecessary activity that increased the company's computing costs.

Senior Vice-President Dave Treadwell told staff: "Please don't use AI just for the sake of using AI." The episode has become a case study in how productivity measurement during a major technology transition can generate the opposite of its intended outcome.

Amazon has since moved from raw token counts to a metric it calls "normalised deployments," designed to measure meaningful AI-driven work rather than sheer activity volume.

Tokenmaxxing and the Goodhart Problem

The practice employees developed in response to KiroRank, dubbed "tokenmaxxing," has become widespread enough to acquire its own name, and it raises a structural question: if a meaningful share of AI consumption is performative, how reliable are the demand figures against which hundreds of billions of dollars in AI infrastructure procurement are being allocated?

The Amazon story is being described by analysts as a textbook case of Goodhart's Law: the principle that when a measure becomes a target, it ceases to be a good measure. The moment token consumption was tied to leaderboards that managers could see, it stopped measuring AI productivity and started measuring competitive anxiety.

Amazon said usage statistics would not factor into performance evaluations, but multiple employees said they believed managers were monitoring the data, with one describing "so much pressure to use these tools" and another citing "perverse incentives."

Not Isolated to Amazon

The measurement problem is industry-wide. Meta employees engaged in similar tokenmaxxing behaviour, competing on an internal leaderboard called "Claudeonomics" that ranked the company's roughly 85,000 workers by token consumption. In a 30-day window, total usage on the dashboard exceeded 60 trillion tokens. The leaderboard was taken down after reporting by The Information.
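To put the dashboard figures in per-person terms, the arithmetic can be sketched as below. This is a rough illustration, not a reported statistic: it assumes all of the roughly 85,000 workers appeared on the leaderboard and that usage was spread evenly, neither of which the reporting confirms, and the function name is hypothetical. Because the total "exceeded" 60 trillion tokens, the results are lower-bound averages.

```python
def average_tokens_per_worker(total_tokens: float, workers: int, days: int) -> tuple[float, float]:
    """Return (tokens per worker over the window, tokens per worker per day).

    Assumes usage is spread evenly across every worker, which is an
    illustrative simplification rather than a reported fact.
    """
    per_worker = total_tokens / workers
    per_worker_per_day = per_worker / days
    return per_worker, per_worker_per_day


# Figures as cited in the article: more than 60 trillion tokens,
# roughly 85,000 workers, a 30-day window.
per_worker, per_day = average_tokens_per_worker(60e12, 85_000, 30)
print(f"Per worker over 30 days: {per_worker:,.0f} tokens")
print(f"Per worker per day:      {per_day:,.0f} tokens")
```

On those assumptions the average comes to roughly 706 million tokens per worker over the window, or about 23.5 million per worker per day.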

A May 2026 report noted that almost every Fortune 500 company is now tracking overall AI usage, with tokens, prompt counts, licence activations and seat-utilisation rates becoming standard surveillance inputs alongside older metrics like keyboard activity.

The financial logic driving this surveillance is straightforward. Amazon faces expected AI infrastructure bills of nearly $200 billion in 2026 while making cuts elsewhere, including layoffs, to keep expenses in check. Every executive who has approved commitments of that scale has an obligation to demonstrate that adoption is occurring at pace.

The Investor Implications

The leaderboard episode raises a question that extends beyond internal HR policy. Wall Street projections for combined hyperscaler capex exceed $1 trillion for 2027, and every hyperscaler has told investors that inference capacity is being absorbed as fast as it can be deployed. If a portion of that inference demand is performative rather than productive, the demand assumptions underpinning those projections require scrutiny.

The risk is not that AI adoption is failing. It is that crude metrics are overstating the quality of adoption while raw usage figures feed infrastructure procurement decisions. Investors in AI infrastructure beneficiaries, including data centre operators and semiconductor suppliers, should weigh whether stated demand growth reflects genuine productivity deployment or a measurement artefact.

For Amazon specifically, the headline impact on AWS revenue is limited in the near term. Over time, the shift to normalised deployments may produce more defensible adoption data for enterprise customers watching how Amazon manages its own AI workforce as a deployment template.

A Structural Warning for Corporate AI Deployment

The financial stakes behind the pressure to show AI adoption are enormous, and tying token consumption to leaderboards visible to management turned a productivity measure into a gauge of competitive anxiety. HR leaders did not design the system maliciously, but the incentive structure that produced tokenmaxxing is a people-management failure, not a technology one.

The episode underscores a broader maturation challenge for enterprise AI. Tools that demonstrably reduce time spent on coding, document drafting, and customer service workflows have genuine productivity value. But measurement frameworks designed to justify capex rather than capture output quality will consistently produce distorted signals. Amazon's pivot to outcome-anchored metrics is a recognition of that reality, and a signal to the broader market that usage volume alone is an inadequate proxy for AI's true productivity contribution.