Who Should Pay for Token Waste?

I want to raise a concern about the current pricing strategy based on model API usage, because I think it creates the wrong incentives and unnecessary stress for users.

Right now, users are charged indirectly based on token consumption, but they have very limited control over how many tokens are actually used. In practice, the only real choice users have is selecting a model with known pricing. How efficiently that model is used (prompt size, retries, hidden context, tool calls, and so on) is entirely in the hands of Cursor’s implementation.
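To make the point concrete, here is a rough back-of-the-envelope sketch of how the same visible prompt can cost very different amounts depending on factors the user never sees. Every number in it is a made-up assumption for illustration (the per-token prices, the hidden context sizes, the call counts), and `request_cost` and `task_cost` are hypothetical helpers, not anything from Cursor or a model provider.

```python
def request_cost(prompt_tokens, output_tokens, price_in, price_out):
    """Dollar cost of one model call; prices are per million tokens."""
    return (prompt_tokens * price_in + output_tokens * price_out) / 1_000_000


def task_cost(visible_prompt, hidden_context, output_tokens, calls,
              price_in, price_out):
    """Dollar cost of one user task that triggers `calls` model calls.

    The user only writes `visible_prompt`; `hidden_context` and `calls`
    are decided by the tool, not by the user.
    """
    per_call = request_cost(visible_prompt + hidden_context, output_tokens,
                            price_in, price_out)
    return per_call * calls


# Hypothetical prices in dollars per million tokens (not real pricing).
PRICE_IN = 3.0
PRICE_OUT = 15.0

# Same 500-token prompt from the user in both cases.
lean = task_cost(500, 2_000, 500, calls=1,
                 price_in=PRICE_IN, price_out=PRICE_OUT)
heavy = task_cost(500, 20_000, 500, calls=30,
                  price_in=PRICE_IN, price_out=PRICE_OUT)

print(f"lean run:  ${lean:.3f}")
print(f"heavy run: ${heavy:.3f}")
print(f"ratio:     {heavy / lean:.0f}x")
```

Under these assumed numbers the user typed the same thing both times, yet the bill differs by two orders of magnitude, and the model choice was identical.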

This setup creates a mismatch of incentives:

  • Users bear the cost and the uncertainty.

  • Cursor's developers are not strongly motivated to be token-efficient, because inefficiency doesn’t directly impact them.

As a result, users end up monitoring usage, worrying about accidental overconsumption, or avoiding features not because they lack value, but because their cost is unpredictable. That friction hurts trust and usability.

If pricing were instead based on a fixed time period (subscription-style), the incentives would flip:

  • Users could focus on productivity without constantly thinking about tokens.

  • Cursor developers would be motivated to use model APIs responsibly and efficiently, because excessive usage would directly affect their margins.

In my view, that would be a healthier model for both sides: less stress for users, better engineering discipline for the product, and clearer alignment between cost and responsibility.

Curious to hear whether others feel the same, or if there are plans to better align incentives around API usage efficiency.


I completely agree. We, as Cursor users, don’t control the agent’s behavior. It can make one request or dozens. Once, when I was experimenting with the plans, Cursor used up the monthly limit in a short period of time (10 or 30 minutes).

So to me, the flat-rate plan is the only fair one, and sooner or later I am sure it will become the industry standard.