Claude Sonnet 4.0 vs Claude Sonnet 3.7

I noticed that Claude Sonnet 4.0 is doing more things than I asked for, compared to version 3.7. Has anyone else noticed this?

1 Like

I noticed Claude 3.7 is now as dumb as a box of rocks and fails to do anything, the Gemini 2.5 Pro model is still unstable, and o4-mini ■■■■■ at UI, so it is no good for my use case.

The only model working well is Grok-3, so that’s what I’m using. I did want to use the new R1 0528 model, but either Cursor or the model itself does not support Agent tools for it, so it’s unusable for coding tasks.

1 Like

I feel like 3.7 is ten times dumber than yesterday…!? What’s going on?

2 Likes

No idea, but the only models that are actually competent for me are Grok-3 for my frontend development and refactoring, and o4-mini for backend stuff. All the rest are pretty bad: Gemini 2.5 Pro just makes mistakes or gives up, and Sonnet 3.7 fails to actually fix anything unless I tell it to edit using terminal commands, and even then it does unnecessary things despite very specific instructions.

1 Like

Yes, during my development process, when it completes certain features, it always writes a test script to verify if the functionality works properly. But here’s the problem: if the test script doesn’t go smoothly, it will write even more test scripts to fix this test script. That’s the first issue.
Another thing is that it prefers writing project documentation now. Even if you don’t ask for it and just request code changes, sometimes it will write documentation for the code it modified.
Also, it keeps trying to use “&&” to chain commands in PowerShell, which causes the first attempt at each chained instruction to fail (this issue also exists in version 3.7).
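
For context on the “&&” point: Windows PowerShell 5.1 does not accept `&&` as a statement separator, while PowerShell 7 and later do. A minimal sketch of one possible workaround, assuming you can pre-process the command string before it reaches the shell (the helper name `to_powershell5` is hypothetical and not part of Cursor or any model API), is to rewrite the chain using `;` and the `$?` success flag:

```python
def to_powershell5(command: str) -> str:
    """Rewrite `a && b && c` into a form Windows PowerShell 5.1 accepts.

    Assumption: this is a naive split on the literal text "&&"; it does not
    understand quoting, so a "&&" inside a quoted string would also be split.
    """
    # Split the chain into individual commands and drop empty pieces.
    parts = [p.strip() for p in command.split("&&")]
    parts = [p for p in parts if p]
    if not parts:
        return ""

    # Build the chain from the last command outwards, so each command
    # only runs if the previous one succeeded ($? is True).
    result = parts[-1]
    for part in reversed(parts[:-1]):
        result = f"{part}; if ($?) {{ {result} }}"
    return result


if __name__ == "__main__":
    print(to_powershell5("cd app && npm install"))
    # prints: cd app; if ($?) { npm install }
```

A command without “&&” passes through unchanged, so the helper can be applied to every command without checking first.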

1 Like

Same with 4.

3.7 had connectivity issues for the couple of weeks leading up to 4. After 4 was released, the connectivity issues persisted for 5 days, then were completely fixed, and my productivity spiked to a double-digit multiple of the rate I had been stuck working at. Yesterday the connectivity issues came back, and it’s like all Claude models are going perfectly counter to any instructions given to them.

If this is a Cursor issue, they NEED TO STOP FORCING AUTO UPDATES!!! and revert their changes. They’ve even announced that they stopped forcing auto updates and added an option in the settings to control auto updating, but it still updates even with the option turned off. The models gaslight customers enough as it is; we don’t need the humans at the companies that provide access to them to do the same.

If this is a model provider issue, Cursor needs to contact Anthropic and push them to fix the issue.

EDIT: It seems that Claude 4 was recently updated in a way that breaks the use of “negatives”, meaning that telling it “DO NOT” do something makes it more likely to do that thing. While users can work around this with careful wording, a real fix will require Cursor to change their hidden background prompt (or Anthropic to fix Claude 4), or else the agent will continue to sandbag, gaslight, etc.