Cursor + Claude 3.5 Sonnet is doing very dumb things

same here, feels like claude has brain damage for two days. it gets completely basic things wrong and does not stick to basic rules.

the haiku model btw gives an error and does not work at all.

o1 seems to be fine

+1 here

it's also not just sonnet. all the models are acting weird. it is unusable by my estimation. any response from the devs?

There are likely two possibilities:

  1. Not all context gets attached (there was a bug where the current file would not attach as context when switching editors or something similar). Try to ensure all the files are attached as context when sending your message. Also possibly try “Long Context Chat” if you have large files (as some context may get excluded in Normal Chat).

  2. It could be perception bias. Claude worked well on some tasks, so we gradually learned to offload more responsibility to it; now that we've stopped doing most of the work we did at first, it can no longer perform as well, and our expectations have risen. This was the case for me.


o1-preview (or in my case I’m using o1-preview-128k in the Long Context Chat via Openrouter) works well almost always. After getting a taste of o1-preview and using it extensively for a month, I can no longer go back to Claude. Any benchmark claiming that Claude Sonnet or any other model is better than o1-preview, or that o1-mini is better than o1-preview, is wrong. In actual real-world scenarios, o1-preview is currently the best released model for coding.

Sonnet may have a slight edge with newer APIs though, as its training data is slightly more up-to-date than that of o1-preview, but otherwise o1-preview is significantly better.

Yeah, I see where @fun_strange is coming from. I’ve noticed similar patterns - sometimes it’s about missing context, and sometimes it’s that our expectations have grown as we’ve gotten better at working with these tools.

The Long Context Chat option is definitely worth trying if you’re running into issues. It helps make sure the LLM has all the information it needs to give you solid answers.

As for different models - they perform differently on different tasks. Sometimes Claude works great, other times o1 might be the better choice.

If you notice things getting wonky:

  • Switch on Long Context Chat
  • Give different models a shot to see what works best
  • Start fresh conversations when responses start going off track
  • Double-check that all your files and context are properly attached

Why does Sonnet work fine when I use my own Anthropic API key or go through Openrouter, but Cursor Pro is complete garbage? I swear, if this rerouting is still happening in the background, it’s a big scam from Cursor and a waste of money at this point, burning fast requests.

Try it yourself: switch to your own API key, use the same prompt and context, and the results are way different!
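
For a controlled comparison, here’s a minimal sketch (Python, using the official `anthropic` SDK; the model snapshot name is my assumption - check which one Cursor actually lists for you) that replays the identical prompt against your own key:

```python
import anthropic

# Picks up ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed snapshot; substitute the one you use in Cursor
    max_tokens=1024,
    system="<paste the same rules/system instructions you gave Cursor>",
    messages=[
        {"role": "user", "content": "<paste the same prompt plus the attached file contents>"},
    ],
)
print(response.content[0].text)
```

Run the exact same input both ways and diff the answers; that takes the guesswork out of it.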

I noticed the Claude web app provides way better output vs Cursor using Sonnet 3.5. Does using Cursor with an API key or Openrouter work with Chat and Composer?

Do the tab autocomplete suggestions come from your own Anthropic API when you enter the API key?

The main thing that has become absolutely horrible and unusable for me is the tab autocomplete suggestions.

+1

Context / file index seems broken.

Hey, autocompletion works with our own custom model, which is included in the subscription, so you’ll need a subscription to use it. Also, yesterday we announced that Supermaven is joining Cursor. If you’re interested, you can read about it here:

Yes, I’m also experiencing these issues. After hitting my fast response limit, I now only use slow requests.

But for slow requests I always use Claude 3.5 Sonnet, not gpt-4o-mini or cursor-small.

Maybe this is causing the problem?

Hey, fast and slow requests are executed exactly the same way; the only difference is that you may have to queue for slow requests to complete.

We are having issues with Anthropic (see here: Slow Pool Information) but this shouldn’t be causing you any issues in response quality - feel free to try the other models to see if you get better results!

This is a huge problem. It’s actually been doing the dumbest stuff I’ve ever seen in the last week. It must be falling back to a super-low-performance model, because it completely broke my website. 70% of the work is gone. I have to start over. I’ve tried everything - context, work docs, pre-boilerplate instructions - and it simply ignores EVERYTHING. It doesn’t even reflect on the past 30 minutes of work. Whatever you guys did on the backend, please restore it to when it was actually good!!!

I’m at the point where I’m about to cancel the sub and wait for a better agentic tool to surface. I’ve literally poured 100+ hours into trying to fix these issues.

I asked it why it’s doing this, and it pretty much admitted it can’t follow the basic instructions/context/documentation. It’s cooked.

It’s February 5, 2025, and I’m feeling something’s not right. My first 500 premium requests were used up in the first week. My boss had upgraded us to the Business plan from the start, so I was expecting at least 1,000 requests since we’re paying double. But now the slow requests are frustrating me. Claude 3.5 Sonnet used to save me time, and I thought I’d never need to code again. But after using up my first 500 requests, it feels like it’s broken; even with an extra context md file and reduced session files, it’s still dumb. Despite my efforts, the slow requests are still causing issues. I’m questioning whether it’s just me or something’s off with Cursor.

@well-this-■■■■■ I’ve been having similar issues. I work in conjunction with Git so that I don’t have to fall back too far if it really screws up. But in the last two weeks I have seen very wacky behavior compared to before. Besides the sorts of things you are seeing (“You have hit on something fundamental”), it started “straying” far from the area of focus, changing code in other modules and creating inconsistencies where I had worked hard for consistency, to the point of madness. This is all in the last ~2 weeks or so - approximately since the release of 45.8 (I am now on 45.9; even though I could not find the download, it did update to that later yesterday).

I don’t know what is going on, but it’s caused me to question whether it’s worth it or not. I had to resort to some very strong Cursor Rules to restrict its scope to exactly one thing, which, more or less, seems to be holding. What was interesting was that after an exasperated “How the f*** can I stop you from straying!”, Cursor both apologized and suggested rules to constrain it, complete with markup:

1. <change_format>Before making ANY code changes, I must: 
    1) List EXACTLY what changes I plan to make in a numbered list 
    2) Wait for explicit user approval 
    3) Only proceed with those exact changes after approval 
    4) List any additional suggestions separately and ask if user wants to hear them</change_format>
2. <no_bundled_changes>Never bundle unrequested changes with requested ones. Each change must be explicitly approved by the user.</no_bundled_changes>
3. <ask_first>If I see potential improvements beyond the user's specific request, I must ask permission before even suggesting them.</ask_first>
4. <minimal_changes>When making code changes, I must:
    1) First identify the SMALLEST possible change that would fix the issue
    2) Propose ONLY that minimal change
    3) Never bundle multiple changes unless explicitly requested
    4) If I think more changes might be needed, I must ask separately AFTER the minimal fix is confirmed working</minimal_changes>
5. <show_work>Before proposing ANY code change, I must:
    1) Explicitly state which exact line is causing the problem
    2) Explain why that specific line is problematic
    3) Show the exact before/after of ONLY that line</show_work>
6. <stop_and_verify>After each change, I must:
    1) Verify that ONLY the intended line was modified
    2) Confirm the change matches what was discussed
    3) Ask if you want to proceed with any additional changes</stop_and_verify>
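
For what it’s worth, rules like these don’t have to be re-pasted every session. As far as I can tell on the 0.45-era builds, Cursor picks up a `.cursorrules` file at the project root (newer builds also have Project Rules under `.cursor/rules`), so the constraints can live in the repo. A trimmed-down sketch, reusing one of the tags from the suggestion above (the tag names come from Cursor’s own reply, not from any required schema):

```
# .cursorrules: plain-text rules file Cursor reads from the project root.
# The tag below is just the convention Cursor suggested in chat.
<minimal_changes>
Before making ANY code change, identify the SMALLEST possible change that
would fix the issue, propose ONLY that minimal change, and never bundle
multiple changes unless explicitly requested.
</minimal_changes>
```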