Claude 3.7 vs. 3.5 in Cursor - A step in the wrong direction?

I’m curious to hear others’ experiences with Claude 3.7.
I’ve spent the past two days using it extensively, and I must say, I found it considerably less effective than 3.5. I’m wondering if this is a general issue or specific to my workflow.

The primary problem is its over-eagerness. It seems to intervene prematurely, attempting code modifications before the issue is even fully described. This often results in unnecessary and, frankly, incorrect code.

It appears to disregard linter warnings, which is a significant concern. The code it generates is often needlessly complex, over-engineering solutions where a simpler approach would suffice. Another concern is that when the model hyper-focuses on a particular issue, it tends to over-analyse it without taking the rest of the codebase into consideration.

The overall impression is one of an overconfident system that lacks a proper understanding of its context and the tools it’s employing. It makes questionable assumptions that lead to broken code. It makes it really difficult to work.

After a day of this, I reverted to 3.5, which is, at present, significantly more manageable. Controlling 3.7 proved to be an exercise in frustration.

Has anyone else encountered similar difficulties? Is there a recommended approach to using 3.7 that I might be overlooking? I’m open to suggestions, but at the moment, I’m finding it quite detrimental to my productivity.

Here is another example of what bothers me: 3.7’s repeated use of the word “likely”. It has so many tools available to check its assumptions, but whether and when it actually uses them feels completely random. I didn’t ask an agent in YOLO mode to tell me what the cause “might” be. Use your tools and check, ffs.

That is also my experience.

One thing I’ve always liked about Claude Sonnet over, say, o3 mini, is Claude’s helpful go-getter attitude wanting to get to work, where my experience with o3 was that you had to cajole it just to get it to edit code.

But with Sonnet 3.7, that go-getter attitude is now dialled up to eleven. It often spins out of control, doing way more than was asked of it.

I’ve put heavy wording in the Rules for AI to try to keep it in check, but it still tends to spiral wildly.

Exactly! I couldn’t agree more on the o3 experience. I have the $200 Pro subscription but end up using Gemini (AI Studio) more, because I don’t have to badger it into providing code rather than a “likely” explanation of my issue.

3.7 has created better code for me in general, but it definitely does a few annoying things that 3.5 doesn’t. For me, the most annoying is that more than half the time, after a command, it’ll say:

Let’s test our implementation by running the application:

And then it forces me to cancel that command because the application is already running in the background. This is probably resolvable with Cursor rules, but it’s a bit annoying out of the box.

I find that you really have to keep an eye on 3.7 because it can easily spend five minutes going off in the wrong direction and adding code you never wanted. I’ve gone back to 3.5 for now and hope things can be improved.

I’m having exactly the same problems with 3.7. It frequently ignores rules files. It frequently fails to follow explicit instructions. It frequently goes off on a tangent making changes I didn’t ask for.

When there is a linter error, it doesn’t consider the root cause and instead adds incorrect and often needlessly complex local hacks.

It also likes to do flat-out stupid things to fix errors, like hardcoding strings, using the any type in TypeScript, or editing generated files, even though my Cursor rules explicitly tell it never to do those things.

It consistently fails to remember what directory it’s in for terminal commands, so it ends up creating files and folders all over the place and then never cleans them up.

It seems to forget what problem it’s supposed to be solving as soon as it hits the first error in code it writes. Then it goes off the rails trying to “fix” the error without paying any attention to the rules, instructions, or original goal.

Overall it’s quite frustrating to use and seems like a step backward from 3.5.
