Opus 4.6 was fun for 1 week..... why?!

Sorry but I have to express my frustration here on the forum.

Claude Opus 4.6 was really great for a week, but for more than a week now it has produced nothing but sh*tcode and token spam, like Claude always did in its early days.

Can someone please explain to me how it is that newly introduced models perform great in their first week, and after that it all turns into token spam?


Opus 4.6 Thinking consistently rocks for me. Sonnet 4.6 Thinking is supposed to be cheaper, but I don't see that in my test cases. It seems to cost more, which is strange; maybe it thinks more. It's faster with heavy tool calls, though, and gets stuck in the terminal less often.

Hey, I hear the frustration. To look into this, I need a few specifics:

  1. Cursor version (Help > About) and your OS
  2. Which mode are you using: Agent, Ask, or something else?
  3. Concrete examples: can you share a prompt and the output you got? Even one or two cases would help a lot.
  4. Request IDs from the sessions that had issues (top right corner of the chat > Copy Request ID)

Without concrete examples, it’s hard to tell if this is a model issue, a context or prompting issue, or something else. A few things to try in the meantime:

  • Make sure your project rules are set up to guide the model’s output style (see Rules | Cursor Docs); a minimal example is sketched after this list
  • Try starting a fresh chat for new tasks instead of continuing a long thread; context buildup can reduce quality
  • Check if you see the same issue with other models like Sonnet; this can help narrow it down
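For illustration only, here is one way a project rules file could look. This assumes the plain-text `.cursorrules` file in the project root; newer Cursor versions also support rules under `.cursor/rules`, so check the linked docs for the current format, and treat the wording below as an example rather than a recommended template.

```
# .cursorrules (example content, adjust to your project)
- Keep changes minimal and focused on the requested task; do not refactor unrelated files.
- Prefer small, reviewable diffs over large rewrites.
- Do not add verbose explanations or boilerplate comments to generated code.
- Ask before introducing new dependencies.
```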

Share those details and we can dig in.

Yes, exactly the same problem here. Claude Opus 4.6 was killer for a couple of weeks back in February. Then they had the outage and that killed it; it came back spotty a few times, but now it’s just worse than ChatGPT, Gemini, and everything else at design coding. It breaks the code more often than not, can’t fix anything anymore, and most certainly can’t design a 6-phase system anymore like it used to. It’s become horrible. I’ll have to drop my Max plan ASAP unless they can get it back to the power it was.


I am not sure whether it is related to the version I am using (2.6.21), but I have seen really strange model behavior lately. For a start, Claude Opus 4.6 Thinking is burning tokens like crazy; I’m not sure if something changed with context management, but for a pretty simple task I had very high consumption. Also, today I used plan mode with Opus (regular), and after it gave me the plan it switched to Opus 4.6 Max Fast even though I don’t even have that enabled! I was in a hurry to go somewhere, waited for the plan to be prepared so I could run the build, and was seconds from clicking on it. But even with the “regular slow non-Max” Opus my credits are melting like crazy, which is pretty weird, because about 10 days ago I had credits left over in my plan and could not even burn through them. Besides, what is going on with the 1M context window that was said to be charged at regular rates?
All of this is on my Windows machine (I have a specific compiler that does not run anywhere else), where I have one set of models available (including a whole bunch of GPT-5.3 Codex models), while on version 2.7 on Mac I have a completely different set (e.g. just one Codex 5.3, not even labeled GPT). Is this how it is supposed to work, or has everything just become an uncontrollable mess?

Hey, a few things here:

Token usage with Opus 4.6 Thinking: thinking models generate extra reasoning tokens in the background before the final output. Those thinking tokens count toward your usage, which is why you’re seeing 8M+ per request. The high-thinking variant uses even more reasoning steps. That said, 8.2M and 8.7M for simple tasks does sound high. Can you share the Request IDs for those sessions (top right corner of the chat > Copy Request ID)? Then we can check if something’s off with the context being sent.
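For a rough sense of why the reported number can be so much larger than the visible answer, here is a toy sketch; the function name and all figures are made up for illustration and are not Cursor’s actual accounting.

```python
# Toy illustration: billed usage covers everything the model reads and writes,
# including hidden reasoning tokens from a Thinking model.
def billed_tokens(input_tokens: int, thinking_tokens: int, output_tokens: int) -> int:
    """Sum of context sent in, hidden reasoning, and the visible answer."""
    return input_tokens + thinking_tokens + output_tokens

# Made-up numbers: a short visible answer, but a large attached context plus a
# long reasoning trace still produce a big total.
total = billed_tokens(
    input_tokens=200_000,     # files, rules, and chat history sent as context
    thinking_tokens=150_000,  # reasoning generated before the final output
    output_tokens=5_000,      # the answer you actually see in the chat
)
print(total)  # 355000
```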

Plan mode switching to Max Fast: this is a known issue we’re tracking. To help us investigate your case, can you share the Request ID from the session where it switched to Opus 4.6 Max Fast without you enabling it?

Different models on Windows vs Mac: this is because you’re on different Cursor versions. v2.6.21 on Windows is behind v2.7 on Mac, and the model list updates with each version. Updating Cursor on Windows to the latest version should sync them up.

1M context window pricing: requests within your included plan allocation show as Included no matter the context length. If you go over your plan limits, longer-context requests may cost more in overages. The details are on your dashboard under usage.
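As a toy illustration of that distinction, the logic is roughly as below; the per-token rate and the long-context multiplier are placeholders, not Cursor’s actual pricing.

```python
# Placeholder sketch of "Included vs. overage" -- all numbers are invented.
def request_charge(tokens: int, plan_tokens_remaining: int, long_context: bool) -> str:
    """Within the plan allocation everything shows as Included; past the
    allocation, long-context requests may be charged at a higher rate."""
    if tokens <= plan_tokens_remaining:
        return "Included"  # context length does not matter inside the plan
    base_rate = 1.0                                     # hypothetical per-token unit
    rate = base_rate * (2.0 if long_context else 1.0)   # hypothetical multiplier
    return f"overage: {tokens * rate:.0f} units"

print(request_charge(tokens=500_000, plan_tokens_remaining=1_000_000, long_context=True))
# Included
print(request_charge(tokens=500_000, plan_tokens_remaining=0, long_context=True))
# overage: 1000000 units
```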

Share those Request IDs and we can take a closer look.

Sometimes I wonder… A new model comes out, and people build impressive projects with it. As those projects grow (fast), they suddenly start noticing that the model is not performing as well anymore. Is that because the model has actually degraded, or because it helped make the project so large and complex? Would the same degradation also be noticeable in a new project?


Yeah, you have to pay a lot more attention to project management and tying up loose ends as the project gets bigger. Even with small projects, agents can leave a ton of loose ends.

I have countered the unexpected expense of Opus 4.6 Thinking by downshifting to Opus 4.5 Thinking for design and planning iterations, followed by GPT 5.3-Codex-High for building out the plan. The combination is a good one.

4.6 for short threads is OK… but I have trouble discerning the difference in output quality versus 4.5.

All 3 sessions are on the same composer ID: e95ea33b-24e6-48e2-8ae8-745bec28934d. Cursor itself switched to plan mode even though it was not supposed to be a big change (a status-handling change in the code).
I cannot be sure anymore which session switched to Max Fast, since I have used it quite a lot afterwards, but it might be this one: 96563f0f-8964-4249-848b-010dd1bf3512. For these reasons it would be nice to be able to disable some of the models on the account in general.
Regarding the different models on different versions: I am not sure I want the ones I have on the newer version (2.7.0-pre.140.patch.0; I switched while trying Glass), especially considering that I can use GPT-5.4 only at medium, not high. Besides, the last time I tried to enable Glass on Windows it did not even work, and now that I have seen Glass I am not sure I even want to enable it anymore :person_shrugging:
1M context window pricing: isn’t it billed at 2x the regular rate? I recall seeing that information somewhere.

Hey, there are a couple of important points here.

Your Cursor version, 2.5.20 from February 19, is noticeably outdated; you’re about 5 weeks behind. Since then, there have been lots of improvements to context handling, model routing, and overall stability. Updating to the latest version is the most effective thing you can do right now. It might even fix the issue you’re seeing on its own.

About the Request ID, I get that you don’t want to share project details, and that’s totally fine. Request IDs don’t reveal your code or prompts. They just let us check how the request was routed and whether the context sent was unusually large or incorrect. Without them, we’re basically guessing. Even a single ID from a session where Opus gave a bad result would help a lot.

Summary of what will actually move this forward:

  1. Update Cursor to the latest version and see if the issue still happens
  2. If it does, send at least one Request ID from a failed session (top right corner of the chat > Copy Request ID)

I’d be happy to dig in as soon as we have that info.

Hi,
I updated to latest Windows version:
Version: 2.6.22 (user setup)
VSCode Version: 1.105.1
Commit: c6285feaba0ad62603f7c22e72f0a170dc8415a0
Date: 2026-03-27T15:59:31.561Z
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Windows_NT x64 10.0.26200

Must confess that it helped a lot already, great!

Regards, and thanks for your patience
