Not very happy with how the pricing turned out. You can forget about GPT-4.1; that one is dead (it costs a full use). o4-mini, while capable, is not looking great once you calculate the credit price of a full use.
I’m expecting Cursor will drop o3-mini soon; otherwise people would keep using it, because it represents much better value than o4-mini and GPT-4.1. I read this as a fail similar to R1: even accounting for thinking, why does o4-mini, roughly 3 times costlier in the API, have the same credit price?
I guess for a cheaper everyday model I’m staying on o3-mini as long as possible. After that I’m not sure; I might skip the 1/3-credit category altogether, since the only option there is Haiku, which is worse than V3 and Grok 3 Mini.
For a free one, Grok 3 Mini looks okay, slightly better than V3 (why doesn’t Cursor have V3.1 already?).
For full use, I will be trying Gemini 2.5 Pro. From using it for a few hours, it seemed capable, though I’m not sure it’s better than Sonnet 3.7, as the benchmarks indicate. o4-mini seemed fine, but the reaction time was tragic: waiting 3-10 minutes for a few lines of code is not a premium service. Also, Cursor’s o4-mini pricing left a bitter aftertaste…
I agree with you that GPT-4.1 should be cheaper. If it were 0.5 credits per use I’d probably use it quite a bit. I really enjoyed the free period with that.
And while o4-mini is indeed quite slow, I’ve found myself using it almost exclusively lately because of the quality of the research it does in gathering context and the code that it produces. It seems to be getting a bit faster, or maybe it just seems that way now that I can distract myself by reading what it’s thinking sometimes. Gemini 2.5 is also good, and quite a bit faster, so I’ll use that for simpler stuff. I don’t think I can ever go back to Claude 3.7 thinking because of how ADHD it is. (I used to use it almost exclusively, and some of my codebase still suffers from the bloat it introduced.)
I think I forgot to post here. Added a few new icons (recommended, warning, removed); all of them can be disabled.
I still believe o4-mini is overpriced, or at least that the use cost is unexplained (and threads about it on the forum get shadowbanned).
I thought it must be wrong, but in the new table they list o3-mini at 0.25, while it used to cost 0.33 (a third). I have no idea what Cursor is doing with those numbers. The only thing that could explain it is some special deal with the AI companies.
o4-mini seems to chew on problems for much longer than other thinking models. With Claude and Gemini, you can see what they’re thinking; o4-mini only gives you summaries of its thinking (if anything), probably because the full output would be TL;DR. That might explain the higher cost.
Yes, it feels a bit slower. Though when Sonnet 3.7 decides to implement 4 extra features, even o4-mini ends up being faster.
The thinking cost should be covered by the “Thinking Coef”, which expresses how many more tokens are used for thinking compared to a run without thinking. Other values, like “True Output Cost ($/1M)”, are calculated from it.
So from these numbers, it makes no sense why o4-mini should be priced as a full use when it is effectively cheaper in the API (lower thinking coefficient) than o3-mini.
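The relationship between those table columns can be sketched like this (a minimal sketch with illustrative placeholder numbers, not the real values from the table):

```python
# Sketch of the "True Output Cost" idea: the thinking coefficient
# inflates the effective per-token price, since hidden thinking
# tokens are billed but never shown. All figures below are
# illustrative, not actual model prices.

def true_output_cost(api_output_price_per_1m: float, thinking_coef: float) -> float:
    """Effective $ per 1M useful output tokens, once thinking tokens
    are counted in (thinking_coef = total tokens / visible tokens)."""
    return api_output_price_per_1m * thinking_coef

# Two hypothetical models with the same API price but different
# thinking overhead: the lighter thinker is effectively cheaper,
# which is the point being made about o4-mini vs o3-mini.
heavy_thinker = true_output_cost(4.40, 3.0)
light_thinker = true_output_cost(4.40, 2.0)
print(heavy_thinker, light_thinker)
```

So with identical API list prices, whichever model burns fewer thinking tokens per visible token should come out cheaper, and should arguably cost fewer credits.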
I suspect either Cursor got some insane deal on o3-mini, or they are just increasing prices because they can. The competition is still not on their level (though the water competitor recently announced custom models, the biggest of which should be on par with Sonnet 3.5 and cost only a fraction; the helper-code competitor recently raised its subscription from $30 to $50, and I’ve read very polarized opinions on its quality).
This is slightly off topic, but I tried out o3 in MAX mode today to see what it would cost me compared to the old model of $0.05 per tool use. I asked it a single question, and it used 18 tool uses (mostly reading various files). That would have cost me $0.95 under the old model. Then I checked how many credits it charged me on the online dashboard: 53.2 credits! So $2.13… more than double what the old model would have charged for the same o3 prompt. One prompt gobbled up a tenth of my monthly credits.
Granted, o3’s per-token cost is something like 3 times that of the next cheapest model, so maybe I was just getting a good deal on o3 before, paying the same per-tool-use cost as I would have for Claude or Gemini.
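A quick sanity check of the numbers in that example (a sketch only; it assumes the prompt itself counted as a 19th use, and the standard 500-credits-per-$20 plan for the per-credit price):

```python
# Comparing the old per-use pricing with the credit-based MAX pricing
# for the o3 example above. Assumptions: the prompt counted as a use
# alongside the 18 tool calls, and one credit costs $20 / 500.

OLD_PRICE_PER_USE = 0.05      # $ per request/tool call under the old model
CREDIT_PRICE = 20 / 500       # $0.04 per credit (assumed plan)

tool_uses = 18
prompt = 1                    # assumption: the prompt itself also counted

old_cost = (tool_uses + prompt) * OLD_PRICE_PER_USE
new_cost = 53.2 * CREDIT_PRICE

print(f"old: ${old_cost:.2f}")               # old: $0.95
print(f"new: ${new_cost:.2f}")               # new: $2.13
print(f"ratio: {new_cost / old_cost:.2f}x")  # ratio: 2.24x
```

Under those assumptions the credit model really does come out at roughly 2.2x the old per-use cost for this prompt.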