Anthropic is spending $5,000 to serve your $200 plan — bug or master plan?

Why many (including me) think cost to compute is the most important number for AI super users.

Aaron Heienickle

Apr 16, 2026

You come into your workday. You sit down to code.

You open Claude Code.

You type your first prompt of the morning.

Nineteen minutes later, it stops working. Wait what?

You get just... a polite little message that your session limit is used up, and to come back in four hours.

This is what started happening to thousands of developers on March 23rd.

Specifically, many developers paying $200 a month for Anthropic’s top-tier Max plan. AKA, the one that’s supposed to give you 20x usage.

One Reddit thread called “20x max usage gone in 19 minutes” hit 330 comments in 24 hours.

Another got 360+ comments in six days.

One subscriber commented, “It’s maxed out every Monday and resets at Saturday. Out of 30 days I get to use Claude 12.”

So, what is going on here?

Is this a bug? Corporate incompetence? Bait and switch?

Kind of. But also — and this is the more interesting answer — it might be one of the most deliberate strategic plays in tech right now.

Let me explain.

First, a stat that will blow you away

Before we get into Antrhropic’s master plan territory, there’s one stat that reframes this entire story.

Anthropic’s $200-per-month Claude Code Max subscription can consume up to $5,000 worth of compute per user, per month.

That’s according to an internal analysis by Cursor, reported by Forbes.

Some analysts think the ceiling is even higher — up to $25,000 for the heaviest power users.

You pay $200. It costs up to $5,000 to serve you. Anthropic eats the difference.

Every. Single. Month.

Imagine a steakhouse charging you $20 for a ribeye that costs them $100 to source, cook, and plate. They’re not running a restaurant — they’re running a very expensive customer acquisition strategy, and hoping they figure out the unit economics before the investors stop picking up the tab.

That’s AI subscriptions in 2026.

One developer dug through his own session logs and found he’d consumed 10 billion tokens across eight months of Claude Code usage.

At retail API pricing, that should have cost over $15,000.

He paid roughly $800 on the Max plan. A 93% subsidy — funded a massive venture bet.

So, why does this cause your session to die in 19 minutes?

Here’s where the bug vs. master plan question actually gets interesting.

The honest answer is possibly both at once.

On the bug side — a developer reverse-engineered the Claude Code binary and found two independent caching bugs that were silently inflating token costs by 10–20x. Real bugs. Confirmed by Anthropic.

Also, in early March, they ran a promotion that doubled usage limits for two weeks, then quietly ended it on March 28th — so users who got used to the doubled limits felt a sudden cliff when it expired.

Then, Anthropic added peak-hour throttling on top of all of that.

The four things hit at once.

So, yes, a list of things happened:

Some of the 19-minute sessions are a genuine bug.
Some are policy changes communicated poorly.
Some are the promotion hangover.

But beneath it all, there is a deeper reason why usage limits exist at all.

That is a $4,800 gap between what you pay and what Anthropic spends to serve you.

And, when the math doesn’t work, something has to give…

Okay, but why would any company do this on purpose?

This is Anthropic’s master plan part.

Think about how Uber launched. Rides were absurdly cheap for years because Uber needed you to stop taking taxis and start opening the Uber app every time you needed a ride. They bought your habit with subsidized prices, knowing the real monetization would come later.

Or think about Amazon Prime. When it launched, free two-day shipping was an economic disaster for Amazon. They lost money on shipping constantly. But they also discovered that Prime members spent dramatically more than non-members. The habit was worth more than the short-term loss.

AI subscriptions are running the same exact playbook.

Anthropic and its investors need your workflow. They need Claude to be the thing you reach for automatically — for code, for writing, for thinking through problems.

If they charge you the real price of $5,000 a month right now, then you don’t subscribe over competitors.

So, they charge $200, absorb the loss, and bet that by the time they need to close that gap, two things will be true:

You’ll be too dependent on the tool to leave,
And the actual cost of computing will have fallen enough for the gap to close on its own.

Here’s why that second bet is actually well-founded

This is the part of the story most people miss completely.

The cost of running AI — what the industry calls cost to compute, measured in cost per token — is falling faster than almost any technology in history.

In late 2022, running a GPT-4-class model cost around $20 per million tokens. Today, equivalent performance costs about $0.40.

That’s a 1,000x reduction in three years.

Faster than PC computing. Faster than internet bandwidth during the dotcom boom.

To put that in perspective: if airline tickets had fallen at the same rate, a cross-country flight would now cost about a quarter.

And it’s not slowing down.

AWS cut GPU pricing 44% in mid-2025. Spot GPU prices dropped 88% in less than two years.
NVIDIA’s next chip generation promises another 10x reduction in token cost.
Anthropic signed a deal with Google for up to one million specialized AI chips specifically to drive their compute costs down.

They are betting is a massive decrease in what it actually costs Anthropic to serve you one session.

Right now, that cost is $5,000 against your $200.

If computing keeps falling at its current rate, that same $5,000 worth of AI work might cost $200 to deliver in a few years.

Potentially, this could drop to much lower than $200 in further years.

That’s the bet, and their ‘master plan.’

So, what does this mean for you right now?

A few honest things.

If you are a Claude Coder, the frustration is completely valid.

Anthropic’s own team acknowledged people are hitting limits “way faster than expected” and called it their top priority.

The communication around the limit changes has been poor. The bugs are real.

If you’re a heavy user hitting walls constantly, you’re not imagining it and you’re not overreacting.

But you’re also, objectively, getting an extraordinary deal right now. One developer tracked $5,623 of API-equivalent compute in a single month of coding — and paid $200. Even with the limits tightening, the subsidy is massive.

The window where AI tools cost a fraction of their actual value won’t last forever. Anthropic needs the economics to work eventually.

Either compute costs fall fast enough that the gap closes naturally — which the data suggests is genuinely possible — or prices go up, or both.

We even got a preview of what abundance looks like: in December 2025, Anthropic doubled usage limits for a week by tapping idle holiday compute capacity.

Then January came and the limits snapped back. People thought something broke. Nothing broke.

The free lunch just ended for the week.

Bug or master plan?

I would say both.

The bugs are real and Anthropic handled them badly. The limits are real and tightening. The frustration of developers is completely earned.

BUT, the $4,800 gap between what you pay and what it costs to serve is definitely a crazy long-term bet — and one that makes a lot of sense if you believe (as the data strongly suggests) that the cost of compute is going to keep falling fast enough to eventually make the math work.

This is why I believe most important number to watch in AI right now is the cost per token.

Because that’s the number that determines whether your $200 plan stays a $5,000 subsidy forever — or whether the world where AI is genuinely cheap and abundant actually arrives.

As of the last few years, that number is falling like a stone.

Discussion about this post

Ready for more?