The price of AI tokens has been a running story all year. The companies that raced through 2025 to put a large language model behind every button now spend 2026 working out what it all costs, and how to make the bill stop climbing. In a June TechCrunch report on the industry’s scramble to contain those bills, the FinOps Foundation’s J.R. Storment said some companies were already 3x over their entire 2026 token budget a third of the way into the year. The conversation, he noted, had turned from going fast to putting up guardrails. Uber burned through its 2026 AI coding budget by April; Microsoft gave its developers Claude Code, then revoked the licenses months later to rein in spend. The mood has flipped from “use more” to “how do we control this?”
Those companies buy AI by the token, metered: the more they use, the more they owe. Individuals like me are on the opposite deal: a fixed monthly subscription, $200 for the Claude Max 20x plan, the same price no matter how hard I push it. A month ago I set out to see what my flat-rate usage would cost billed their way. I built a small tool, pointed my machines at it, and let it run. Thirty days later, the number is in.
What the meter measures
Claude Code, Anthropic’s command-line coding tool, shows a figure called the API-equivalent cost: what a session would have run at Anthropic’s published per-token rates. For a metered customer it’s the actual bill; on my plan it’s hypothetical.
My tool, codecostmeter.com, records that figure across every computer I work on, rolled up by day and by model. Only the usage math leaves my machines: token counts and the dollar estimate, never a prompt, file, answer, or login. One honest limit: it sees only Claude Code in a terminal, not a browser tab or a desktop app. I rarely use those, though, so it catches nearly all my usage.
Thirty days: $3,959.78
Across my first month on the meter (May 24 through June 22), my API-equivalent usage came to $3,959.78. My employer paid $200 for it: a discount of very nearly twenty to one.
I leaned on it all 30 days; not one was off. The work moved roughly 4.48 billion tokens: 34.8 million of input, 31.8 million of output, and the rest, over 4.3 billion, cached context re-read on every turn (more on that shortly). My heaviest single day was Sunday, June 21, at $299.07; weekends ran the hottest, which won’t surprise anyone who does their deepest thinking once the meetings stop. The daily average was about $132.
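The headline ratios follow directly from the figures above. As a quick check, here is a minimal Python sketch that uses only the numbers quoted in this post; the variable names are illustrative, and the token total is the rounded 4.48 billion, so the cached figure is approximate.

```python
# Figures quoted in the post (token counts are rounded).
total_cost = 3959.78      # API-equivalent dollars over the month
plan_price = 200.00       # flat subscription price in dollars
days = 30
total_tokens = 4.48e9
input_tokens = 34.8e6
output_tokens = 31.8e6

# How many dollars of metered usage each subscription dollar bought.
discount_ratio = total_cost / plan_price

# Average API-equivalent spend per day.
daily_average = total_cost / days

# Whatever is neither fresh input nor output is cached context.
cached_tokens = total_tokens - input_tokens - output_tokens
cached_share = cached_tokens / total_tokens

print(f"discount ratio: {discount_ratio:.1f} to 1")
print(f"daily average:  ${daily_average:.2f}")
print(f"cached tokens:  {cached_tokens / 1e9:.2f} billion ({cached_share:.1%} of total)")
```

On these rounded inputs, cached context is the overwhelming majority of the tokens moved, which is why the re-read behavior described later matters so much.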
The early-Uber comparison
I keep circling a comparison that does my bargain no favors. In the early 2010s, an Uber felt suspiciously cheap. In San Francisco in early 2015 you could take an uberPOOL anywhere in the city for a flat $5, no catch, even when nobody shared the route. Those rides seemed like the future arriving early. They were also losing money on purpose: by one widely cited analysis, 2015 passengers covered only about 41% of the true cost of their trips, with venture capital paying the rest. This was the “millennial lifestyle subsidy,” investors quietly buying you a cheaper life to capture the market first.
A flat-rate frontier-AI plan has the same shape. A model like this spends real compute on every answer, and $200 nowhere near covers a month of what I push through it. Someone absorbs the gap, for now, to win the market, the way Uber once did. The twist: that same company drowning in AI token bills today once sold cross-town rides for five bucks at a loss. The difference this time is that I can watch my subsidy to the penny, because Anthropic built the meter into the product.
What you run matters more than that you run it
If those totals sound alarming, here’s the reassuring part for anyone weighing AI for real work: the model you pick and the size of its context window move the number far more than the act of using it.
My earliest AI-powered site, thanksbutnope.com, writes polite, context-aware declines to the vendor pitches that fill my inbox. It runs on Claude Haiku 4.5 (older, smaller, cheaper) on Amazon Bedrock, called several times a day. Its entire inference bill last month, at full API pricing, was nineteen cents of real money. The economy models stay remarkably capable for a great many jobs, and they cost almost nothing.
My day-to-day sits at the other pole: Claude Opus 4.8 with a one-million-token context window at maximum effort, the most capable setup an individual subscriber can reach, because the flat plan lets me. Fresh in the morning, the meter rises slowly, and I’m often under $30 by lunch. But once a single session fills around 70% of that window, the cost leaps five or ten dollars every few minutes. The cause is mechanical: each turn re-reads the whole context, so a nearly full window reprocesses close to a million tokens every time I press enter. That one behavior accounts for most of the 4.3 billion cached tokens above. As I write this, today’s reading, an ordinary day across a few projects, sits at $160.
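The re-read mechanic can be sketched in a few lines of Python. This is a simplified model, not how Claude Code computes its figure: it assumes every turn is charged for re-reading the full context at one flat rate, and that the context grows by a fixed amount each turn. The function names and the rate of $1.00 per million tokens are placeholders for illustration, not Anthropic’s actual pricing.

```python
def turn_cost(context_tokens: int, rate_per_million: float) -> float:
    """Cost of one turn that re-reads the whole context (hypothetical flat rate)."""
    return context_tokens / 1_000_000 * rate_per_million


def session_cost(start_tokens: int, added_per_turn: int, turns: int,
                 rate_per_million: float) -> float:
    """Total cost of a session whose context grows by a fixed amount each turn."""
    total = 0.0
    context = start_tokens
    for _ in range(turns):
        total += turn_cost(context, rate_per_million)  # pay to re-read everything so far
        context += added_per_turn                       # the window keeps filling
    return total


# Placeholder rate, NOT a real price: $1.00 per million re-read tokens.
RATE = 1.00

# Same 20 turns of work, starting from an empty window versus a 70%-full one.
fresh = session_cost(start_tokens=0, added_per_turn=10_000, turns=20, rate_per_million=RATE)
full = session_cost(start_tokens=700_000, added_per_turn=10_000, turns=20, rate_per_million=RATE)

print(f"20 turns from an empty window:    ${fresh:.2f}")
print(f"20 turns from a 70%-full window:  ${full:.2f}")
```

Under these assumptions the per-turn cost scales linearly with how full the window is, so identical work costs far more late in a long session than at the start of a fresh one.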
Cheap while it lasts
I don’t believe a flat $200 plan that multiplies my own output several times over lasts forever. The enterprise side is already metered, the consumer plans carry weekly caps for the heaviest users, and the dollar figure Anthropic now shows me reads like getting customers used to a number before it becomes a bill. My honest take: for the foreseeable future, the individual subscription is about as cheap as frontier-model access is going to get.
So I’ll wring every dollar of value from it for as long as I can. And if $3,959.78 in a month sounds like a lot of leaning on one tool, I have a colleague on the same $200 plan who says my token consumption is significantly less than his.
This post took $21.41 to produce, by the meter.
