Claude Sonnet 4.0 vs Claude Sonnet 3.7

lucksilver79 · May 30, 2025, 7:08pm

I noticed that Claude Sonnet 4.0 is doing more things than I asked for, compared to version 3.7. Has anyone else noticed this?

Cpyrighted · May 30, 2025, 7:16pm

I noticed Claude 3.7 is now as dumb as a box of rocks and fails to do anything, and the Google 2.5 Pro model is still unstable. o4-mini ■■■■■ at UI, so it is no good for my use case.

The only model working well is Grok-3, so that’s what I’m using. I did want to use the new R1 0528 model, but either Cursor or the model does not support Agent tools for it, so it’s unusable for coding tasks.

lehuygiang28 · May 31, 2025, 7:04pm

I feel like 3.7 is ten times dumber than yesterday…!? What’s going on?

Cpyrighted · May 31, 2025, 7:16pm

No idea, but the only models that are actually competent for me are Grok-3 for my frontend development and refactoring, and o4-mini for backend stuff. All the rest are pretty bad: Gemini 2.5 Pro just makes mistakes or gives up, and Sonnet 3.7 fails to actually fix anything unless I tell it to edit using terminal commands, and even then it does unnecessary things despite very specific instructions.

wkea · May 31, 2025, 7:29pm

Yes, during my development process, when it completes certain features, it always writes a test script to verify if the functionality works properly. But here’s the problem: if the test script doesn’t go smoothly, it will write even more test scripts to fix this test script. That’s the first issue.
Another thing is that it prefers writing project documentation now. Even if you don’t ask for it and just request code changes, sometimes it will write documentation for the code it modified.
Also, it keeps trying to chain commands with “&&” in PowerShell, so the first attempt at every chained command fails to execute properly (this issue also exists in version 3.7).
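
For anyone hitting the same “&&” problem: Windows PowerShell 5.1 does not accept “&&” as a command separator (PowerShell 7 and later do), which would explain the failures if the agent's terminal runs 5.1. Below is a minimal sketch in Python of how a chained command could be rewritten into a form 5.1 accepts, using `;` plus a check of the `$?` success variable. This is only an illustration, not something Cursor or Claude is known to do. The function name `to_powershell_5` is hypothetical, and the naive split assumes “&&” never appears inside a quoted string.

```python
def to_powershell_5(command: str) -> str:
    """Rewrite 'a && b && c' so that each step runs only if the previous
    one succeeded, without using the '&&' operator.

    Assumption: '&&' does not occur inside quoted arguments, so a plain
    string split is good enough for this sketch.
    """
    steps = [s.strip() for s in command.split("&&") if s.strip()]
    if not steps:
        return ""

    # Build the script from the last step outward, nesting each later
    # step inside an 'if ($?) { ... }' guard on the step before it.
    script = steps[-1]
    for step in reversed(steps[:-1]):
        script = f"{step}; if ($?) {{ {script} }}"
    return script


if __name__ == "__main__":
    print(to_powershell_5("npm install && npm test"))
    # prints: npm install; if ($?) { npm test }
```

A command without “&&” passes through unchanged, so the same function could in principle be applied to every command before it is sent to the terminal.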

LMJTM · June 1, 2025, 4:29am

Same with 4.

3.7 had connectivity issues for the couple of weeks leading up to 4. After 4 was released, the connectivity issues persisted for 5 days, then were completely fixed, and my productivity spiked to a double-digit multiple of the rate I had been stuck working at. Yesterday the connectivity issues came back, and it’s like all Claude models are going perfectly counter to any instructions given to them.

If this is a Cursor issue, they NEED TO STOP FORCING AUTO UPDATES!!! and revert their changes. They’ve even announced that they stopped forcing auto updates and added an option in the settings to allow auto updating, but it still updates even with the option turned off. The models gaslight customers enough as it is; we don’t need the humans at the companies that provide access to them to do the same.

If this is a model provider issue, Cursor needs to contact Anthropic and push them to fix the issue.

EDIT: It seems that Claude 4 was recently updated in a way that breaks the use of “negatives”, meaning that telling it “DO NOT-” do something makes it likely to do that thing. While users can work around this with careful wording, a real fix will require Cursor to change their hidden background prompt (or Anthropic to fix Claude 4), or else the agent will continue to sandbag, gaslight, etc.
