Is it only me or is Claude 4 Sonnet way too dumb lately?

matgenehr · June 26, 2025, 3:02pm

I have to rewrite commands, stop it from generating responses, correct it, and make it find solutions to simple problems that it should handle easily.

I kept it running and it deployed a script that deleted all of my functions from Firebase, completely outside of what I had asked for. wtf?

Aleks_Sergeenko · June 26, 2025, 4:46pm

I noticed the same thing too. Just 3-4 weeks ago everything was working fine, but now I have to fix even simple scripts 5-6 times. It feels like it’s been intentionally made “dumber” so that users would spend more tokens. There’s simply no way a model could start making this many mistakes literally within a few weeks. And I’m not even talking about other issues.

T1000 · June 26, 2025, 5:07pm

Hi @matgenehr @Aleks_Sergeenko

No, Claude 4 Sonnet is not getting dumber. It performs really well, and I use it daily without such issues on the personal Pro plan.

Creating a thread in Bug Reports would be great to help investigate your issue properly.

The Cursor team has stated repeatedly that nothing is making the models dumber, intentionally or otherwise. With a Bug Report, the issue can be checked to make sure the models are working well.

Please feel free to tag me there with @ so I don’t miss your Bug Report.

Lacakdaisical · June 26, 2025, 5:14pm

Depending on which instance of Claude you’re getting you will get a different level of performance out of it.

cocode · June 26, 2025, 5:16pm

Not dumber. I can confirm that it is amazing. It works next to perfect.

matgenehr · June 26, 2025, 5:35pm

I think this is not a bug, I think this is related to the model itself.

T1000 · June 26, 2025, 5:36pm

Haha, I hear you. The offer still stands to discuss what's going on in more detail.

matgenehr · June 26, 2025, 5:37pm

yup. claude 4 opus is a gigantic leap over sonnet, imo.

claude 4 sonnet is really good, but it reminds me of the way claude 3.7 sonnet worked.

matgenehr · June 26, 2025, 5:38pm

sure, thanks. this is something related to the reasoning capabilities of each model.. not cursor itself.

matgenehr · June 26, 2025, 5:39pm

it is really good. it just does not perform very well on complex tasks. gemini 2.5 pro is way better at debugging complex tasks.

i’ve had bugs that claude 4 sonnet did not fix even after 5 attempts. 1 attempt with gemini and it was solved.

matgenehr · June 26, 2025, 5:39pm

that’s crazy, huh

nightcoder · June 26, 2025, 5:40pm

I have seen the same pattern over the last week. It hallucinates more than ever and does things I tell it not to do, even after being told multiple times. I keep having to tell it to continue or ask whether it is stuck, and this is with very simple things. I am at the point where I am asking myself if I am just being scammed for more and more requests.

normalnormie · June 26, 2025, 5:41pm

If the issues are related to following rules, then yes, it has been less consistent in following them than it was in the first two weeks. It's possible that the cause isn't Cursor (system prompts, for example). I read a user on Reddit who says new models run at full capacity at launch to show muscle and then get degraded slightly to optimize resource usage (and monetary extraction), but I have not yet seen research confirming this.

Lacakdaisical · June 26, 2025, 5:42pm

It's not Cursor but the models themselves. You need to give the model strict rules and knowledge when you first call it. Sometimes it will swap in a fresh instance, and when it does, the new instance will have no recollection of your convo/history, even in the same chat, unless you tell it to read the history.

nightcoder · June 26, 2025, 5:43pm

Trust me, I have Cursor rules, tests, even fitness functions, all as DDD as I could make it, and I believe I can, but… since the last two updates…

Lacakdaisical · June 26, 2025, 5:44pm

Create a memory log for the agent to use to rehydrate content and refocus

T1000 · June 26, 2025, 5:48pm

Interesting. I stopped using rules. It feels like there is no need for them anymore with the latest Claude 4 Sonnet Thinking.

Well, I have one user rule now: Follow SOLID & DRY

Lacakdaisical is right: use files for info storage (feature details, progress log, implementation plan). I ask the Agent to read MD files I create or to write its own updates.

Usually my prompts are 1-2 sentences and then it works on that for 1-3h.

Implement plan for feature {approximate filename}.

Not having any of the issues you all mention.

Feel free to pick my brain!

T1000 · June 26, 2025, 5:55pm

After all the issues with Sonnet 3.7, the current 4 is a breath of fresh air.

matgenehr · June 26, 2025, 6:07pm

i don’t like using memories. if the model screws up, it saves the memory with the error, thinking that it was implemented right. i keep memory saving active, but constantly delete them all.

Lacakdaisical · June 26, 2025, 6:08pm

Oh, I didn’t mean the Cursor memory (I have that off myself), but rather a file that stores what you need done, specifications, etc. To add to T1000’s comment: let's say you put a prompt, implementation plan, etc. into a file. Tell the model to “run filename.md” (or whatever extension you're using) and it will work from it. In the file you can tell it to check the chat history to refresh its memory, plus anything else you need, and add an instruction for it to update the file after each delivery/implementation for consistency.
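
For anyone who wants to script the bookkeeping, here is a minimal sketch of that kind of file-based log. It assumes a plain Markdown file with a checklist plan and a progress section; the file name `agent_log.md` and the helper names are made up for illustration and are not a Cursor feature or API.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical file name; nothing in this thread prescribes one.
LOG_PATH = Path("agent_log.md")


def init_log(path: Path, feature: str, plan_steps: list[str]) -> None:
    """Create the log with a feature title and a checklist plan."""
    lines = [f"# {feature}", "", "## Implementation plan"]
    lines += [f"- [ ] {step}" for step in plan_steps]
    lines += ["", "## Progress log", ""]
    path.write_text("\n".join(lines), encoding="utf-8")


def append_progress(path: Path, note: str) -> None:
    """Append a timestamped entry to the progress log after each delivery."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {stamp}: {note}\n")


def rehydration_prompt(path: Path) -> str:
    """Build a short prompt that points the agent back at the log file."""
    return (
        f"Read {path.name} before doing anything else. "
        "Continue with the first unchecked plan step, "
        "then append what you changed to the progress log."
    )


if __name__ == "__main__":
    init_log(LOG_PATH, "Example feature", ["Write the plan", "Implement step one"])
    append_progress(LOG_PATH, "Plan written, nothing implemented yet")
    print(rehydration_prompt(LOG_PATH))
```

In practice you could just as well ask the agent to keep the file up to date itself, as described above; the script only shows what the file might contain and what a "refocus" prompt could look like.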
