I love Cursor, but this update has me scratching my head.
I feel like the combination of 0.46.x + Sonnet 3.7 yields very weird agentic behavior that can only be described as overzealous execution.
Almost every revision I ask for results in the agent trying to write code that is way beyond the scope of what I asked.
Also, it seems like the agent now aggressively uses tool calling, but in a way that degrades the entire exploration of the problem space and the execution.
Like, more often than not, I see the agent running over super long horizons, almost like it's going rogue. It feels like it wants to go and build everything on its own by trial and error and never stop (perhaps I should turn off YOLO).
I have found that I now have to stop the response because the agent starts writing code or making modifications that break core functionality. It’s like it takes every request as a challenge to refactor the entire thing.
At this point, I don’t know how to adapt my workflow. I have tried reducing this behavior with rules with no success. I have also tried other models, but 3.7 looks promising, and I’m now cognitively biased to believe it is better than 3.5, although it might not be.
I’m starting to prefer o3-mini + agent, even though that combo feels so half-baked, but at least it is not as weird and destructive as the Sonnet 3.7 + agent combo.
I’ve noticed this as well with 3.7. It will go and do things I didn’t ask it to do. It will also ignore rules a lot unless I remind it. It forgets context and goes on its own tangents unrelated to my original request. Very annoying!
You have described my experience with Agent + Claude 3.7 almost word for word. I have put it in the rules not to call deployment commands in the terminal and not to change styles and CSS when I haven’t asked it to. That does not seem to work. I noticed an unwanted code modification only after a couple of commits and lost 40 minutes fixing it. I was much more content with my previous slow and steady workflow with Composer. It also seems that Agent mode creates many times more traffic, and both Claude and GPT have become painfully slow, less smart, and less responsive, not to mention being in slow mode most of the time…
It’s hard to tell, but I’m also running into the issue of it being FAR too overzealous. Cursor 0.45.X with 3.5 Sonnet felt excellent in that I could give it clear directives about what I wanted it to do and it would do it. With Cursor 0.46.X and 3.7 Sonnet, no matter how clear I make my instructions, it acts like it knows better than I do about what I want and generates a bunch of extra, unnecessary code. The number of tool calls to read irrelevant files is also quite irritating, especially when I “@” the relevant files in my prompts.
It’s most definitely 1000% Cursor and anybody claiming otherwise is full of ■■■■. Windsurf is a night and day difference: zero issues with Sonnet. Claude.ai: zero issues. Claude Code: zero issues.