Who is the over-zealous actor: Cursor 0.46.X or Sonnet 3.7 Review or both?

whoisjuan · March 4, 2025, 3:24am

I love Cursor, but this update has me scratching my head.

I feel like the combination of 0.46.x + Sonnet 3.7 yields very weird agentic behavior that can only be described as overzealous execution.

Almost every revision I ask for results in the agent trying to write code that goes way beyond the scope of what I asked.

Also, it seems like the agent now aggressively uses tool calling, but in a way that degrades the entire exploration of the problem space and the execution.

Like, more often than not, I see the agent executing into super long horizons, almost like going rogue. It feels like it wants to go and build everything on its own by trial and error and never stop (perhaps I should turn off YOLO).

I have found that I now have to stop the response because the agent starts writing code or making modifications that break core functionality. It’s like it takes every request as a challenge to refactor the entire thing.

At this point, I don’t know how to adapt my workflow. I have tried reducing this behavior with rules with no success. I have also tried other models, but 3.7 looks promising, and I’m now cognitively biased to believe it is better than 3.5, although it might not be.

I’m starting to prefer o3-mini + agent, even though that combo feels so half-baked, but at least it is not as weird and destructive as the Sonnet 3.7 + agent combo.

imagio · March 4, 2025, 2:07pm

I’ve noticed this as well with 3.7. It will go and do things I didn’t ask it to do. It will also ignore rules a lot unless I remind it. It forgets context and goes on its own tangents unrelated to my original request. Very annoying!

T1000 · March 4, 2025, 2:11pm

Sonnet 3.7’s attention span is very short, especially in thinking mode.

Eriksmiks · March 4, 2025, 4:32pm

You have described my experience with Agent + Claude 3.7 almost word for word. I have put it in the rules not to call deployment commands in the terminal and not to change styles and CSS when I haven’t asked it to. That does not seem to work. I noticed unwanted code modifications only after a couple of commits and lost 40 minutes fixing them. I was much more content with my previous slow and steady workflow with Composer. It also seems that Agent mode creates many times more traffic, and both Claude and GPT have become painfully slow, less smart, and less responsive, not to mention being in slow mode most of the time…

camm73 · March 4, 2025, 8:35pm

It’s hard to tell, but I’m also running into the issue of it being FAR too overzealous. Cursor 0.45.X with 3.5 Sonnet felt excellent in that I could give it clear directives about what I wanted it to do and it would do it. With Cursor 0.46.X and 3.7 Sonnet, no matter how clear I make my instructions, it acts like it knows better than I do about what I want and generates a bunch of extra, unnecessary code. The number of tool calls to read irrelevant files is also quite irritating, especially when I “@” the relevant files in my prompts.

dingausmwald2 · March 4, 2025, 9:24pm

This was already the case with .43 and Claude 3.5.
It has been like this for the 6 months I have been using it. Nothing new here.

MonkeyCrumbs · March 4, 2025, 11:01pm

It’s most definitely 1000% Cursor and anybody claiming otherwise is full of ■■■■. W1ndsurf is a night and day difference, zero issues with Sonnet. Claude.ai, zero issues. Claude Code: zero issues.

system · April 3, 2025, 11:01pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
