The flat-rate pricing of $0.05 per prompt plus $0.05 per tool call creates some strange incentives that affect how the system works.
If you’ve been following the forums, you’ve probably seen the ongoing discussions about context handling. Many users have noticed changes in how files are included and how context is managed between versions. When asked directly about these changes, the response from the team was:
"here’s what we can share!
we unified chat and composer, having both modes run through a single unified prompt
improved summarization to make it more efficient
new Sonnet 3.7 caused us to do some prompt tuning
also increased context window for 3.7 to 120k and 3.7 max mode to 200k"
This kind of answer acknowledges some changes, but it doesn’t address the specific concerns about how context is actually being handled, or what “more efficient” summarization really means for our workflows. @codebase was removed, and users report that their files are no longer placed directly into the context window and instead have to be pulled in through tool calls (I can’t confirm this, as I haven’t re-subscribed yet).
Here’s my core concern: if Max mode is simply meant to “pass along the cost” of the API (as suggested in the forums), why use flat-rate pricing? Whether I use 10k tokens or 190k tokens in my context window, I pay the same $0.05 per prompt. This creates a situation where Cursor is incentivized to limit what goes into context to control its own costs, even though the expanded context is one of the main selling points of Max mode.
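To make the incentive concrete, here is a rough sketch of the arithmetic. The $0.05 flat fee is the figure from this post; the per-token rate is a placeholder I picked for illustration, not a quoted provider price, and the function names are made up for this example.

```python
# Flat fee per prompt, as described in this post.
FLAT_FEE_PER_PROMPT = 0.05

# ASSUMPTION: hypothetical input price in USD per 1M tokens.
# This is an illustrative placeholder, not a real provider rate.
ASSUMED_INPUT_PRICE_PER_MILLION = 3.00


def provider_cost(input_tokens: int,
                  price_per_million: float = ASSUMED_INPUT_PRICE_PER_MILLION) -> float:
    """Estimated upstream cost of sending `input_tokens` of context."""
    return input_tokens / 1_000_000 * price_per_million


def flat_rate_margin(input_tokens: int) -> float:
    """What is left of the flat fee after the estimated upstream cost."""
    return FLAT_FEE_PER_PROMPT - provider_cost(input_tokens)


for tokens in (10_000, 190_000):
    print(f"{tokens:>7} tokens: "
          f"upstream cost ${provider_cost(tokens):.2f}, "
          f"margin ${flat_rate_margin(tokens):.2f}")
```

Under that assumed rate, the small prompt leaves a positive margin and the large one a negative margin, while I pay the same either way. The exact numbers depend entirely on the real rate, but the direction of the incentive doesn’t.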
The simple solution? Token-based pricing that mirrors how the API providers actually charge. This would:
- Create perfect alignment between Cursor’s costs and user pricing
- Give users full control over their context usage (and associated costs)
- Remove any incentive for Cursor to limit context for cost reasons
- Provide natural transparency about what’s happening with our context
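As a sketch of what that could look like, here is a minimal token-based charge with an itemized breakdown. Everything here is hypothetical: the rates, the optional markup, and the names `TokenRates`, `Usage`, and `itemized_charge` are my own illustration, assuming separate input and output rates. It is not how Cursor or any provider actually bills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """ASSUMPTION: hypothetical USD rates per 1M tokens, not real prices."""
    input_per_million: float
    output_per_million: float


@dataclass(frozen=True)
class Usage:
    """Token counts for a single prompt."""
    input_tokens: int
    output_tokens: int


def itemized_charge(usage: Usage, rates: TokenRates,
                    markup: float = 0.0) -> dict[str, float]:
    """Charge that scales with tokens used, plus an optional (hypothetical) markup."""
    input_cost = usage.input_tokens / 1_000_000 * rates.input_per_million
    output_cost = usage.output_tokens / 1_000_000 * rates.output_per_million
    subtotal = input_cost + output_cost
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "markup": subtotal * markup,
        "total": subtotal * (1 + markup),
    }


# Illustrative usage with made-up rates.
rates = TokenRates(input_per_million=3.0, output_per_million=15.0)
for usage in (Usage(10_000, 1_000), Usage(190_000, 1_000)):
    charge = itemized_charge(usage, rates)
    print(f"{usage.input_tokens:>7} input tokens -> total ${charge['total']:.3f}")
```

The point of the itemized result is the transparency bullet above: the bill itself would show how much context was actually sent on each prompt.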
I understand there are UI considerations and that implementing token-based billing might be complex. But the current model feels like it’s working against the very benefit Max mode is supposed to provide: expansive context.
What do others think? And Cursor team, would you consider a token-based pricing approach for Max mode in the future?
If not, I’m going to be skeptical of the intentions behind the current pricing. I’m holding off on re-subscribing until this gets at least some sort of response from the team.