Sonnet 4.5 - New model is available in Cursor

There seems to be an issue with the context calculator in the Agent. I am still on 1.6.27, and cannot upgrade due to the terminal issues. So maybe this was fixed. But right now, on claude-4.5-sonnet :brain: I see 176K listed as the context window when it is supposed to be 200K:

Screenshot 2025-09-30 at 11.27.01 AM

The grok-code-fast-1 :brain: model shows it only has a 204.8K context window, when it is supposed to be 256K:

Screenshot 2025-09-30 at 11.27.49 AM

GPT-5 seems to show 272K context in the usage meter, and it does indeed have a 272K context window. Not sure why Grok Code and Claude 4.5 are both hamstrung, though. I seem to rack up context usage pretty fast on BOTH models, and now I’m wondering how much I’ve been suffering due to incorrectly calculated context windows.
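For reference, a quick comparison of displayed versus advertised windows, using the numbers quoted above. Interestingly, the two models are short by different fractions (Grok's displayed window is exactly 80% of 256K, which could suggest a deliberate reservation rather than a display bug; that's speculation on my part):

```python
# Displayed vs. advertised context windows, numbers taken from the posts above.
advertised = {"claude-4.5-sonnet": 200_000, "grok-code-fast-1": 256_000, "gpt-5": 272_000}
displayed = {"claude-4.5-sonnet": 176_000, "grok-code-fast-1": 204_800, "gpt-5": 272_000}

for model, adv in advertised.items():
    shown = displayed[model]
    print(f"{model}: {shown / adv:.0%} of advertised ({adv - shown:,} tokens short)")
# claude-4.5-sonnet: 88% of advertised (24,000 tokens short)
# grok-code-fast-1: 80% of advertised (51,200 tokens short)
# gpt-5: 100% of advertised (0 tokens short)
```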


Tried this one - it ate my subscription in a matter of a couple of hours of coding :sad_but_relieved_face: . $2+ per request, that's crazy! Context is below 300k

4.5 is disappointing, even in Claude Code. I don't trust it after having used Codex.

It found one bug, tho.


How did Claude 4.5 Sonnet blow my Ultra subscription limits in a couple of hours? Cursor administration, can you restore my limits? This is nonsense. My subscription was enough to use GPT-5 models, but then I decided to test Claude’s new model, and it literally devoured all my money in an instant without doing anything useful.


It looks like Claude Sonnet 4.5 in your case had very poor caching quality, or you were opening new chats and tasks too frequently.

I don’t know what the problem is exactly, but the fact is that the model really ate up all the limits in a couple of hours. I don’t understand how this happened; I didn’t give it too many tasks or open dozens of chats.

That looks like close to 20 million tokens. It’s $3/MTok input and $15/MTok output. You blew through 20 million tokens, bud. Cost-wise, that’s what, $400? If you only paid $50, nicely done!

You need to learn how to use Cursor more efficiently. There is a lot of talk about starting new chats; however, there is benefit to keeping an existing chat going for a while as well (it boosts your cached token usage, which is 1/10th the cost).
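To put rough numbers on that claim, here's a sketch of the cost math at the published Sonnet rates. The 18M/2M input/output split and the cache-hit rate below are made-up assumptions for illustration, not anything from the usage screenshot:

```python
# Back-of-the-envelope API cost for ~20M tokens at $3/MTok input, $15/MTok output,
# with cache reads priced at ~1/10th the fresh-input rate.
INPUT_RATE, OUTPUT_RATE = 3.00, 15.00   # $ per million tokens
CACHED_RATE = INPUT_RATE / 10           # cache reads: ~1/10th input price

def cost(input_mtok, output_mtok, cached_frac=0.0):
    fresh = input_mtok * (1 - cached_frac) * INPUT_RATE
    cached = input_mtok * cached_frac * CACHED_RATE
    return fresh + cached + output_mtok * OUTPUT_RATE

# 18M input / 2M output: no caching vs. 80% cache hits
print(cost(18, 2))                 # 84.0
print(round(cost(18, 2, 0.8), 2))  # 45.12
```

So even under this toy split, heavy caching roughly halves the bill, which is why keeping a chat warm matters.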


My $200 Ultra subscription has always been enough for a full month of GPT 5 models. I don’t understand how the Claude 4.5 Sonnet model ate up the monthly limit in just a couple of hours. It’s simply unrealistic, impossible, and shouldn’t be happening. I didn’t give it any hyper-complex tasks that required that much computation, yet it managed to squander so many tokens.

I’ve been using Cursor almost since its first month and I know exactly how to use it. It’s not me, it’s just some kind of tokenization error or something else.


It is not likely a tokenization error, as we receive the exact token counts you consumed as reported by Anthropic. It would be widely known if Anthropic had a tokenization issue.

The most probable cause is context and token management combined with Max mode and a thinking-model selection. It is also always advisable to monitor the token usage of a new model when you start using it.

Do the coding tasks really require the combination of Max mode and Thinking mode?
Have a look at whether some of my suggestions here help


I’ve been using Sonnet 4.5 since yesterday, and so far, I really like it. The second biggest complaint I had with Sonnet 4, speed, seems to have been largely resolved. This 4.5 model is quite fast…maybe not as fast as Grok Code, but fast enough that I’m preferring Sonnet 4.5 now.

I am really appreciating how it handles things. I have always thought Sonnet had a more natural…”intuition” (not that it actually has that), as it just seems to understand prompts better than any other model I have used, for the way I prompt and the work I do.

It’s been solving problems very well. It has been handling problem resolution much better, and with just a better approach, than either Grok Code or GPT-5 (any of them). I don’t really have any complaints thus far. (Well, other than cost of course; this thing is literally 10x the cost of Grok Code!!)

The speed boost has kept me on this model alone since I started using it. I use it for everything again, and it’s been quite nice so far. I’m sure at some point it’s going to go bonkers off the rails and screw things up…every model does. We’ll see how it fares once it does, in comparison to other models, though…


Max mode consumes larger context windows, which quickly increases your costs (including Cursor’s markup %). Avoid using Max unless necessary.
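One way to see why a long Max-mode session burns tokens so fast: every turn re-sends the whole conversation, so cumulative input tokens grow roughly quadratically with turn count. This is a toy model (the 5K-tokens-per-turn figure is an assumption, and real sessions are softened by caching and context truncation):

```python
# Toy model: each turn appends ~5K tokens (prompt + tool output + reply)
# and re-sends the entire history, so input tokens accumulate quadratically.
TOKENS_PER_TURN = 5_000

def total_input_tokens(turns: int) -> int:
    # Turn n re-sends the history built up over turns 1..n.
    return sum(TOKENS_PER_TURN * n for n in range(1, turns + 1))

print(total_input_tokens(10))   # 275000
print(total_input_tokens(80))   # 16200000
```

Eighty turns in one long session already lands in the tens of millions of tokens, the same order of magnitude people upthread are reporting.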


Last month I also used Max Mode with GPT-5 High and it didn’t create such problems with token consumption, but this time I just didn’t set goals for Claude 4.5 Sonnet huge enough for it to eat so many tokens. I think it’s basically unrealistic to spend an Ultra subscription in a couple of hours; that’s some kind of nonsense.

I have been using it all day, and I see it still makes so many mistakes; it is nowhere near Codex. I use the Codex CLI and it is the best thing: it really thinks through the tasks and gives back quality, finished code.


It can’t possibly. At 51.8% context now (Max) it’s slower than molasses in January. Brutal. It had given me a good session today until about mid-day. Traffic? Server pressure? Or context-related?

Claude Thinking really solves some problems, but it’s too expensive for me. The last time I tested it (I underestimated the complexity of the task, but still), neither Sonnet 4.1 nor gpt-5-high could handle the task, and gpt-5 couldn’t handle it either, at several times less money.


@condor So, there is an issue with the @Docs feature in Cursor. It does not work anywhere. It may never have worked anywhere except with Claude before; however, now it doesn’t even work with Claude. I haven’t had a chance to test and figure out when it was last working; I suspect it may not have worked since 1.3.x even, but it might have been working in 1.4.x at some point.

I only really recall it ever working properly with Claude in the past. I don’t think I have ever seen it work with GPT-5 (which I spent a lot of time using) or Grok Code (which I have spent quite a bit of time using). I started using Sonnet again a few weeks ago, in concert with Grok, and at first I wasn’t paying attention, but as I noticed the model having trouble with tasks it had previously done SUPERBLY WELL with when indexed docs were supplied, I looked into it more and realized that @Docs simply does not work.

NO models, not even Sonnet 4.5 here, seem to actually look at the docs. In the past, Sonnet actually had a very explicit interaction: it would show you the chapters it was referencing for the task at hand, and it REALLY helped, a ton. Docs were WAY better than web search, which is marginally useful at best, assuming the model was even able to get useful hits in its searches.

I had hoped that Sonnet 4.5 would work with docs again, but it does not. This is really hampering my flow, as I now have to search for relevant docs myself, and then, when I find web pages, it doesn’t seem as though ANY of the models can actually reference them at all. So I have to tell the models to use HTTPie at the command line to download the web pages and analyze their non-HTML content, to actually glean the information needed to do a good job on docs-dependent tasks. It’s totally ridiculous, when you guys have an EXISTING feature built right in that should be CRUSHING this need with perfect results every time (it certainly seemed to before).
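For anyone wanting to replicate that workaround, here's a minimal sketch of the tag-stripping step (the fetch itself would be done with HTTPie or similar; the function and sample page here are mine, not part of Cursor or HTTPie):

```python
# Strip a fetched HTML page down to readable text for a model to consume.
import html
import re

def strip_html(raw: str) -> str:
    # Drop script/style blocks entirely, then remove remaining tags,
    # unescape entities, and collapse whitespace.
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", "", raw, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", html.unescape(text)).strip()

page = "<html><body><h1>API Docs</h1><p>Use &amp; enjoy.</p></body></html>"
print(strip_html(page))   # API Docs Use & enjoy.
```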

Please look into the Docs feature. I don’t know when it broke, but, it broke so hard it is like it doesn’t even exist anymore.

@jrista The team is working on it: @Docs Context commands are broken - #6 by deanrie

Ah! Great to hear! Thanks for the update!


So, I am honestly quite excited about Sonnet 4.5. It seems it has become a truly INTERACTIVE model/agent now. When the model needs to know something, and it determines it cannot acquire the information itself, it will REQUEST the information from the user, then STOP and wait for the user to supply the necessary information! I have very rarely seen that with GPT-5, and never with Grok Code. However, a lot of what I’ve been working on lately is manual testing and debugging of processes that an agent simply couldn’t test and fix on its own.

Grok Code, and even mostly Sonnet 4, were not “interactive” enough for this very manual process to work. However, Sonnet 4.5 has very happily integrated into this highly “stop-and-go” process; it has no problem checking as much as it can, and when it needs information it cannot get itself, it asks and waits!

I think THAT kind of “conversational,” “interactive back-and-forth” approach is going to become CRITICAL in the long term for effective agentic software development. Glad to see Anthropic is paving the way there.


The new model seems to work well, but something seems really wrong with the pricing… I have the Ultra plan and tested a few small tasks this morning, maybe 5 messages with Claude 4.5 Thinking with Max on, and it already says this:

image
