Cursor + Claude 3.5 Sonnet is doing very dumb things

same here, feels like claude has brain damage for two days. it gets completely basic things wrong and does not stick to basic rules.

the haiku model btw gives an error and does not work at all.

o1 seems to be fine

+1 here

it's also not just sonnet. all the models are acting weird. it is unusable by my estimation. any response from the devs?

There are likely two possibilities:

  1. Not all context gets attached (there was a bug where the current file would not attach as context when switching editors or something similar). Try to ensure all the files are attached as context when sending your message. Also possibly try “Long Context Chat” if you have large files (as some context may get excluded in Normal Chat).

  2. It could be perception bias. Claude worked well on some tasks, so we gradually learned to offload more responsibility to it; now that we've stopped doing most of the work we did at first, it can no longer perform as well, and our expectations have risen. This was the case for me.


o1-preview (or in my case I’m using o1-preview-128k in the Long Context Chat via Openrouter) works well almost always. After getting a taste of o1-preview and using it extensively for a month, I can no longer go back to Claude. Any benchmark claiming that Claude Sonnet or any other model is better than o1-preview, or that o1-mini is better than o1-preview, is wrong. In actual real-world scenarios, o1-preview is currently the best released model for coding.

Sonnet may have a slight edge with newer APIs though, as its training data is slightly more up-to-date than that of o1-preview, but otherwise o1-preview is significantly better.

Yeah, I see where @fun_strange is coming from. I’ve noticed similar patterns - sometimes it’s about missing context, and sometimes it’s that our expectations have grown as we’ve gotten better at working with these tools.

The Long Context Chat option is definitely worth trying if you’re running into issues. It helps make sure the LLM has all the information it needs to give you solid answers.

As for different models - they perform differently on different tasks. Sometimes Claude works great, other times o1 might be the better choice.

If you notice things getting wonky:

  • Switch on Long Context Chat
  • Give different models a shot to see what works best
  • Start fresh conversations when responses start going off track
  • Double-check that all your files and context are properly attached

Why does Sonnet work fine when I use my own Anthropic API key or go through Openrouter, but Cursor Pro is complete garbage? I swear, if this rerouting is still happening in the background, it’s a big scam from Cursor and a waste of money at this point, burning fast requests.

Try it yourself: switch to your own API key, use the same prompt and context, and the results are way different!
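
For a controlled comparison, here’s a minimal sketch (Python, using the official `anthropic` SDK; the model snapshot name is my assumption - check which one Cursor actually lists for you) that replays the identical prompt against your own key:

```python
import anthropic

# Picks up ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed snapshot; substitute the one you use in Cursor
    max_tokens=1024,
    system="<paste the same rules/system instructions you gave Cursor>",
    messages=[
        {"role": "user", "content": "<paste the same prompt plus the attached file contents>"},
    ],
)
print(response.content[0].text)
```

Run the exact same input both ways and diff the answers; that takes the guesswork out of it.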

I noticed the Claude web app provides way better output vs Cursor using Sonnet 3.5. Does using Cursor with an API key or Openrouter work with Chat and Composer?

Do the tab autocomplete suggestions come from your own Anthropic API when you enter the API key?

The main thing that has become absolutely horrible and unusable for me is the tab autocomplete suggestions.

+1

Context / file index seems broken.

Hey, autocompletion works with our own custom model, which is included in the subscription, so you’ll need a subscription to use it. Also, yesterday we announced that Supermaven is joining Cursor. If you’re interested, you can read about it here:

Yes, I’m also experiencing these issues. After hitting my fast response limit, I now only use slow requests.

But for slow requests I always use Claude 3.5 Sonnet, not gpt-4o-mini or cursor-small.

Maybe this is causing the problem?

Hey, fast and slow requests are executed exactly the same way; the only difference is that you may have to queue for slow requests to complete.

We are having issues with Anthropic (see here: Slow Pool Information) but this shouldn’t be causing you any issues in response quality - feel free to try the other models to see if you get better results!

This is a huge problem. It’s actually been doing the dumbest stuff I’ve ever seen in the last week. It must be falling back to a super-low-performance model, because it completely broke my website. 70% of the work is gone. I have to start over. I’ve tried everything - context, work docs, pre-boilerplate instructions - and it simply ignores EVERYTHING. It doesn’t even reflect on the past 30 minutes of work. Whatever you guys did on the backend, please restore it to when it was actually good!!!

I’m at the point where I’m about to cancel the sub and wait for a better agentic tool to surface. I’ve literally poured 100+ hours into trying to fix these issues.

I asked it why it’s doing this, and it pretty much admitted it can’t follow the basic instructions/context/documentation. It’s cooked.

It’s February 5, 2025, and I’m feeling something’s not right. My first 500 premium requests were used up in the first week. My boss had upgraded us to the Business plan from the start, so I was expecting at least 1,000 requests since we’re paying double. But now the slow requests are frustrating me. Claude 3.5 Sonnet used to save me time, and I thought I’d never need to code again. But after using up my first 500 requests, it feels like it’s broken; even with an extra context md file and reduced session files, it’s still dumb. Despite my efforts, the slow requests are still causing issues. I’m questioning whether it’s just me or something’s off with Cursor.

@well-this-■■■■■ I’ve been having similar issues. I work in conjunction with Git so that I don’t have to fall back too far if it really screws up. But in the last two weeks I have seen very wacky behavior compared to before. Besides the sorts of things you are seeing (“You have hit on something fundamental”), it started “straying” far from the area of focus, changing code in other modules and creating inconsistencies where I had worked hard for consistency, to the point of madness. This is all in the last ~2 weeks or so - approximately since the release of 45.8 (I am now on 45.9; even though I could not find the download, it did update to that later yesterday).

I don’t know what is going on, but it’s caused me to question whether it’s worth it or not. I had to resort to some very strong Cursor Rules to restrict its scope to exactly one thing, which, more or less, seems to be holding. What was interesting was that after an exasperated “How the f*** can I stop you from straying!”, Cursor both apologized and suggested rules to constrain it, complete with markup:

1. <change_format>Before making ANY code changes, I must: 
    1) List EXACTLY what changes I plan to make in a numbered list 
    2) Wait for explicit user approval 
    3) Only proceed with those exact changes after approval 
    4) List any additional suggestions separately and ask if user wants to hear them</change_format>
2. <no_bundled_changes>Never bundle unrequested changes with requested ones. Each change must be explicitly approved by the user.</no_bundled_changes>
3. <ask_first>If I see potential improvements beyond the user's specific request, I must ask permission before even suggesting them.</ask_first>
4. <minimal_changes>When making code changes, I must:
    1) First identify the SMALLEST possible change that would fix the issue
    2) Propose ONLY that minimal change
    3) Never bundle multiple changes unless explicitly requested
    4) If I think more changes might be needed, I must ask separately AFTER the minimal fix is confirmed working</minimal_changes>
5. <show_work>Before proposing ANY code change, I must:
    1) Explicitly state which exact line is causing the problem
    2) Explain why that specific line is problematic
    3) Show the exact before/after of ONLY that line</show_work>
6. <stop_and_verify>After each change, I must:
    1) Verify that ONLY the intended line was modified
    2) Confirm the change matches what was discussed
    3) Ask if you want to proceed with any additional changes</stop_and_verify>
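
For what it’s worth, rules like these don’t have to be re-pasted every session. As far as I can tell on the 0.45-era builds, Cursor picks up a `.cursorrules` file at the project root (newer builds also have Project Rules under `.cursor/rules`), so the constraints can live in the repo. A trimmed-down sketch, reusing one of the tags from the suggestion above (the tag names come from Cursor’s own reply, not from any required schema):

```
# .cursorrules: plain-text rules file Cursor reads from the project root.
# The tag below is just the convention Cursor suggested in chat.
<minimal_changes>
Before making ANY code change, identify the SMALLEST possible change that
would fix the issue, propose ONLY that minimal change, and never bundle
multiple changes unless explicitly requested.
</minimal_changes>
```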