Ridiculous excitement over new models when none of them work very well, and Sonnet has gone backwards too

The new Cursor updates have taken the product backwards. There’s too much focus on adding models at the expense of the core product, and the new versions aren’t using Sonnet very well either. Why are we now getting so many short responses and questions like ‘Would you like me to focus on fixing this specific issue?’ These seem designed to use up fast requests, if you ask me. I’m losing patience with this product. It’s costing too much, and not just in money: the output is dreadful at times, there’s no safety in place (it happily deletes critical business logic), and even Sonnet has got worse. I think they should stop adding models and sort out the core product offering, otherwise it’s just not worth it.

rant over.


That’s not my experience at all; I’d say everything is working better than ever (I only use claude-3.5-sonnet). Maybe the way you’re prompting is the issue?


There are many moving parts in processing your prompt: your local .cursorrules, the model used, the size of your project, and the scope/bounds of your request to the model (@codebase vs ‘fix one specific file’). For me it’s a never-ending process of tweaking them and choosing different parts for different prompts. So yeah, it’s not easy, but now I don’t get those ‘Would you like me to focus on fixing this specific issue?’ questions.
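For example, my project rules spell out the stack and the bounds up front. The wording below is just an illustrative sketch (a .cursorrules file is free-form instructions, so phrase it however suits your project; the stack named here is made up):

```
# .cursorrules (illustrative sketch, adjust to your own stack)
This is a TypeScript/Node project using Express and Prisma.
When I point you at a single file, change only that file; do not refactor elsewhere.
Never delete existing business logic unless I explicitly ask you to.
```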


ChatGPT-o3 so far is almost hireable. Like all LLMs you have to bird-dog it on context and strategy, and it’s a lot slower than Claude, but it’s also doing vastly higher-quality work.

The only issues with o3 are a low bias for action (it likes to over-confirm) and a tendency to say “okay, done!” when it’s not… but that’s like when you ask Claude if it’s finished yet, then have to stop Claude before it can put its dirty paws in the pastry.


Agree to some extent. I’m not a fan of Cursor rules: if I scan through a codebase I can see the stack, patterns and packages, and the model should be able to do the same. That ability seems to be going backwards with Cursor and Sonnet.

“Almost hireable” made me chuckle. Very true. Agree with flipping between models, too. I think the key is that some models have seen a problem more often than others, so the training data is stronger, but that’s usually only on easier problems. The issue is when you’re working on things the models won’t have been trained on; that’s where the enterprise cost-benefit case is weaker, and it comes back to “almost hireable”. Thanks for your balanced response, which is certainly more balanced than my middle-of-the-night, stressed-out dev’s rant.

My prompting is ■■■■■■ perfect thank you :grinning:

Enterprise doesn’t use LLMs unless they’re Microsoft partners, lol. Or maybe I’m biased, being ex-Dell and knowing how many light years behind my little startup agency they are in terms of tooling.

We’ve actually only added DeepSeek recently, specifically because it offers great performance for the cost! Beyond that, we don’t often add new models unless they’re exceptional compared to what we already offer.

The short responses and confirmation questions are pretty much ‘out-of-the-box’ behaviour for Cursor, but our various “rules” features are meant exactly for this purpose: you can instruct Cursor not to ask follow-up questions and to make more assumptions, or to plan out its actions more before executing them - whatever you want!
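For example, a couple of rules along these lines (the wording is illustrative; rules are plain instructions, so phrase them however you like):

```
Do not ask follow-up or confirmation questions; make reasonable assumptions and state them.
Before editing, briefly outline your plan, then carry it out in one pass.
```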

If you have specific and clear examples of where Claude is not working as well as you think it should, do send them over!