Three in Four Companies Can’t Track Their Own AI Token Bills
KPMG finds 74% of companies lack clear AI token cost visibility, as Uber drained its annual AI budget in four months and Amazon deleted a usage leaderboard.
Three in four companies have no comprehensive view of their AI usage costs, with just 26% of enterprises reporting full billing visibility, per a new KPMG survey reported by the Wall Street Journal. The finding arrived the same week Uber disclosed it had burned through its entire 2026 AI budget in four months, Walmart began rationing tokens to employees on its in-house AI agent, and Amazon quietly shut down a usage leaderboard that had pushed engineers to inflate their own token consumption at company expense.
Bills That Arrived Without Warning
Steve Chase, KPMG’s global head of AI, told the Wall Street Journal the consulting firm is working alongside companies that have already consumed annual token and cloud budgets in a matter of months, with one client reporting a sixfold jump in token consumption. He described AI billing as “a new resource that needs to be managed that didn’t exist quite that way” as enterprises now encounter “exponential growth.” KPMG’s survey covered C-suite and senior leaders at organizations with annual revenues above $1 billion; 50% said they have only partial visibility into AI costs, and 22% said they see the numbers only when the invoice arrives.
Uber is the most public case. Praveen Neppalli Naga, the company’s chief technology officer, told The Information in April that Uber had already consumed its full-year AI allocation in roughly four months. Engineers had adopted agentic coding tools including Anthropic’s Claude Code at a rate far faster than the company had projected when setting its budget. Naga told The Information: “I’m back to the drawing board because the budget I thought I would need is blown away already.” Andrew Macdonald, the company’s chief operating officer, added that the company cannot draw a clear line from its rising token spend on the agentic coding tools to concrete improvements for riders and drivers.
Walmart hit the same wall a week before the KPMG figures were published. The retailer had given employees unlimited tokens on Code Puppy, an in-house AI agent built for tasks including spreadsheet analysis and presentation drafting. Bloomberg reported June 1 that, after demand exceeded projections, Walmart replaced unlimited access with a fixed token allocation per employee. Suresh Kumar, Walmart’s global chief technology officer, told journalists at the company’s annual shareholder meeting that the firm was seeing “a plethora of repeated inquiries” from employees and wanted workers to stop posing identical questions to the agent repeatedly.
Running AI Agents Multiplied the Cost Faster Than Budgets Moved
A token, the unit AI providers use to meter consumption, is roughly equivalent to three-quarters of an English word. A simple question-and-answer exchange might cost a few hundred to a few thousand tokens. The same work handed to an agentic coding tool (one that plans, writes, tests, and revises code in an autonomous loop) can consume hundreds of thousands. Per a Goldman Sachs analysis, agentic AI workflows may push token demand 24 times higher than traditional prompt-and-response interactions.
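The rule of thumb and the multiplier above can be turned into a rough back-of-envelope estimate. The sketch below is illustrative only: the function names are hypothetical, the 0.75 words-per-token ratio is the approximation quoted above rather than the output of any real tokenizer, and the 24x figure is the Goldman Sachs estimate, not a measured value.

```python
# Rough token arithmetic based on the figures quoted in the article.
# Assumption: 1 token ~ 0.75 English words (a rule of thumb, not a tokenizer).
WORDS_PER_TOKEN = 0.75
# Assumption: agentic workflows use ~24x the tokens of a chat exchange
# (the Goldman Sachs estimate cited above).
AGENTIC_MULTIPLIER = 24


def estimate_tokens(word_count: int) -> int:
    """Approximate the token count of English text with `word_count` words."""
    return round(word_count / WORDS_PER_TOKEN)


def agentic_tokens(chat_tokens: int, multiplier: int = AGENTIC_MULTIPLIER) -> int:
    """Scale a chat-style token count by the assumed agentic multiplier."""
    return chat_tokens * multiplier


chat = estimate_tokens(1_500)    # a 1,500-word exchange
agentic = agentic_tokens(chat)   # the same work under the 24x assumption
print(f"chat: ~{chat:,} tokens, agentic: ~{agentic:,} tokens")
```

Real counts depend on the model's tokenizer and on how many steps an agent actually takes, so these numbers are useful for budgeting orders of magnitude, not for reconciling an invoice.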
The consumption multiplier accelerated in November 2025, when multiple AI labs released substantially improved agentic model versions. Companies that had been running modest experiments through the year suddenly had access to tools capable of much longer autonomous work cycles, and adoption spread quickly across engineering teams. Token bills from those experiments began landing in Q1 2026.
Three structural features of per-token billing catch finance teams off guard:
- Per-seat licenses charge a flat fee regardless of consumption; per-token billing charges for every token that each user, and each autonomous agent they run, actually processes.
- Agentic loops multiply themselves. Each request from an AI agent can spawn several sub-requests, and each consumes tokens, so cost per task scales non-linearly as autonomy increases.
- Usage spikes are invisible in real time. Without dedicated monitoring tools, a team running hundreds of parallel agents may not see what they owe until the billing period closes.
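The three features in the list above can be illustrated with a small cost model. This is a sketch under stated assumptions, not any vendor's billing logic: the price per million tokens, the tokens per request, and the fan-out shape are all hypothetical placeholders, and the projection is a simple linear run rate.

```python
# Hypothetical inputs, chosen only to make the arithmetic visible.
USD_PER_MILLION_TOKENS = 10.0   # assumption: not a real price
TOKENS_PER_REQUEST = 2_000      # assumption: average tokens per model call


def total_requests(fanout: int, depth: int) -> int:
    """Model calls made when every request spawns `fanout` sub-requests,
    nested `depth` levels deep (depth 0 is a single plain request)."""
    return sum(fanout ** level for level in range(depth + 1))


def per_token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost of `tokens` under per-token billing."""
    return tokens / 1_000_000 * usd_per_million


def projected_period_spend(spend_to_date: float, days_elapsed: int,
                           days_in_period: int) -> float:
    """Linear run-rate projection of spend at the end of the billing period."""
    return spend_to_date / days_elapsed * days_in_period


# Cost per task grows geometrically with the depth of the agent loop.
for depth in range(4):
    calls = total_requests(fanout=3, depth=depth)
    cost = per_token_cost(calls * TOKENS_PER_REQUEST, USD_PER_MILLION_TOKENS)
    print(f"depth {depth}: {calls} calls, ${cost:.2f} per task")

# A mid-period check: $300 spent after 10 days of a 30-day period.
print(f"projected: ${projected_period_spend(300.0, 10, 30):.2f}")
```

With a fan-out of three, one task goes from 1 call to 40 calls across three levels of nesting, which is the non-linear scaling the second bullet describes. The run-rate check is the kind of signal a team lacks when it only sees spend at invoice time.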
J.R. Storment, executive director of the FinOps Foundation (a professional body under the Linux Foundation that develops financial-operations practices for cloud and AI spending), told TechCrunch in early May what that gap looked like from inside enterprise finance teams. “In April and May, I started hearing from companies: ‘Oh my god, we are 3x over our entire 2026 token budget and it’s only April,’” he said.
OpenAI’s CEO Named a Problem His Clients Were Hiding
At OpenAI’s “Intelligence at Work” enterprise livestream on June 2, the company’s own chief executive put the billing problem on record.
“People are really saying, you know, it’s kind of a meme now, but ‘My company spent my entire 2026 budget in Q1. Can you make this more efficient?’”
Sam Altman, co-founder and chief executive of OpenAI, added that the cost issue had arrived “all of a sudden” from a start-of-year position where “nobody cared about costs, everyone was happy with what they were spending.” He said AI cost is now the second most common complaint from enterprise customers, behind requests to simplify workflows.
Six Years, a Million-Fold Token Jump
Six and a half years ago, the heaviest token user on OpenAI’s platform consumed roughly 100,000 tokens per month. Today, 100,000 tokens per month is the global per-capita average across all users. The current internal top user at the company processes around 100 billion tokens monthly; Peter Steinberger, developer of the OpenClaw application, publicly disclosed spending 603 billion tokens over 30 days. The New York Times reported one OpenAI employee used 210 billion tokens in a single week.
Per Axios reporting, one unnamed company ended up with a bill of roughly $500 million on Anthropic’s Claude in a single month after failing to install any usage limits for employees. TechCrunch described the same incident in separate reporting.
Why Opacity Helps the Vendor
Both major AI labs have moved primarily to per-token billing. Every chunk of text a model reads, recalls, or generates runs up a charge. Altman acknowledged at the June 2 event that the company maintains an internal token leaderboard, with some employees competing and posting their consumption totals publicly on X. OpenAI sells tokens by volume; the leaderboard culture inside the company rewards consuming as many as possible.
An industry consultant quoted by Yahoo Finance described the period before 2026 as “subsidized intelligence,” when AI labs priced below the true cost of compute to capture enterprise market share. As the major AI labs move toward IPOs, real compute-based pricing is replacing the subsidized version. For companies that built their AI workflows during that subsidized phase without metering infrastructure, the shift arrived with no prior warning in early 2026.
Leaderboards That Rewarded Waste
At Amazon, the internal AI usage leaderboard was called KiroRank, and it tracked employees on Kiro, the company’s agentic developer platform. When workers realized rankings measured how many tokens they consumed, some began generating unnecessary AI calls to inflate their scores; that practice spread under the informal industry label “tokenmaxxing.” Amazon had separately set a target of more than 80% of developers using AI each week, creating additional pressure that reinforced the gaming.
Dave Treadwell, Amazon’s senior vice-president of engineering, told staff the leaderboard had been “created with good intentions” but had driven up costs as workers padded their consumption. “Please don’t use AI just for the sake of using AI,” he said, as reported by the Financial Times. The company confirmed the tool had been “deprecated,” calling it “a beta dashboard… not a formal or approved tool.” Amazon’s replacement metric, “normalised deployments,” counts AI-generated code that actually ships to production.
The same dynamic was at work at Uber before its budget overrun. The Information reported the company had encouraged staff to use AI “as much as possible,” with internal usage rankings by employee; that competitive structure was still active when roughly 5,000 of its engineers began adopting agentic coding tools at scale in early 2026. Meta employees reportedly engaged in similar behavior on their own internal usage tables, per the Financial Times. At Amazon, the Financial Times further reported, some workers gamed KiroRank scores in part out of job-security concerns, using AI calls to appear productive amid broader workplace anxiety about the technology.
Who Is Building the Tools to Track AI Spending?
Earlier this month, several large companies moved to contain runaway AI token costs, and the corrections landed within days of each other. Uber introduced a cap of $1,500 per month per employee per agentic coding tool, tracked separately per tool, with exceptions granted case by case. Walmart replaced Code Puppy’s unlimited token access with a fixed allocation, the specific amount undisclosed. Microsoft revoked its developers’ access to Claude Code and announced a June 30 deadline to migrate them to its internal Copilot CLI toolchain, a move that aligns with the end of Microsoft’s fiscal year.
| Company | Action Taken | AI Tool Affected |
|---|---|---|
| Uber | $1,500/month cap per employee, per tool | Claude Code, Cursor |
| Walmart | Fixed token allocation (amount undisclosed) | Code Puppy |
| Amazon | Deleted KiroRank; adopted “normalised deployments” metric | Kiro |
| Microsoft | Revoked access; migrating to Copilot CLI by June 30 | Claude Code |
An Uber spokesperson told Bloomberg the cap was designed to “responsibly encourage agentic AI adoption and experimentation at scale across the company,” with each employee able to track per-tool usage through an internal dashboard.
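A per-employee, per-tool cap of this kind amounts to a ledger keyed on the pair (employee, tool). The sketch below is a minimal illustration, not a description of any company's internal system: only the $1,500 monthly figure comes from the reporting above, while the class, method names, and employee and tool identifiers are hypothetical. It leaves out the monthly reset and the case-by-case exceptions.

```python
from collections import defaultdict

MONTHLY_CAP_USD = 1_500.0  # the reported cap; everything else here is hypothetical


class SpendLedger:
    """Tracks one month's spend separately for each (employee, tool) pair."""

    def __init__(self, cap_usd: float = MONTHLY_CAP_USD) -> None:
        self.cap_usd = cap_usd
        self._spent = defaultdict(float)  # (employee, tool) -> USD spent

    def remaining(self, employee: str, tool: str) -> float:
        """Budget left this month for this employee on this tool."""
        return self.cap_usd - self._spent[(employee, tool)]

    def try_charge(self, employee: str, tool: str, usd: float) -> bool:
        """Record the charge and return True if it fits under the cap;
        otherwise leave the ledger unchanged and return False."""
        if usd > self.remaining(employee, tool):
            return False
        self._spent[(employee, tool)] += usd
        return True


ledger = SpendLedger()
print(ledger.try_charge("alice", "claude-code", 1_200.0))  # fits under the cap
print(ledger.try_charge("alice", "claude-code", 400.0))    # would exceed it
print(ledger.try_charge("alice", "cursor", 400.0))         # separate tool, separate cap
print(ledger.remaining("alice", "claude-code"))
```

Keying on the pair rather than on the employee alone is what makes the cap apply "per tool": spend on one assistant does not reduce the allowance on another.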
The FinOps Foundation is working to formalize what individual companies are currently improvising. Storment told TechCrunch the conversation has “shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’” Startups, established vendors, and a new standards effort within the Foundation are racing to give enterprises a common framework for tracking AI spend across platforms and departments.
Recent earnings calls reflect the same pressure surfacing at the CFO level. Shopify, Spotify, ServiceNow, and Roku all cited AI as a growing share of operating costs in recent quarterly results, per analysis in The Atlantic. JPMorgan published a research note titled “AI Token Costs Are Eating Internet Profits Alive.” Ramp’s May spending index showed Anthropic overtaking OpenAI in overall US enterprise AI adoption for the first time, at 34.4% versus 32.3%, with Claude Code cited as the primary driver of that growth.
Per the KPMG Q4 AI Pulse survey from January, 67% of large-company leaders plan to maintain AI spending through a recession, and they project spending an average of $124 million a year on AI deployments.
