Claude 3.7 is super slow

Is it normal to wait this long for a single request? I only prompted "test". I have been having this issue for 3 days now. I exceeded my 500-request quota a long time ago and never had to wait this long before. This happens with Claude 3.7/3.5, GPT-4o, and Grok 3.

request ID:
6865c55d-30ce-4ab1-926c-ed48a9700ecc


No, it is not. Check the staff response in my thread. I too have been waiting for 3 days, haha. I am dying :sob: lol. I put all of my projects on pause 3 days ago and have been waiting since. Not hating, though; I love this project. It has changed my life. Vibe coding has been a lot of fun for someone who can't wrap their head around programming, or who hasn't been able to so far at least.

Go add your request ID to the thread for them if you have time.

Ah … so I am not the only one xD Well, thanks for the info, I'll share the ID later.


No problem, I got you. Hopefully it gets fixed soon so we can get back to it :flexed_biceps:


Me too. If your issue gets solved, let me know. Thank you, brother.


So I guess this behavior is intended :slight_smile:
I reached out to support about it and this is their answer:

you’ve used all 500 of your fast requests for this billing period. When this happens, requests are processed as slow requests, which typically take 2-5 minutes during peak periods. This is expected behavior when fast requests are depleted.
To get faster response times, you can enable usage-based pricing for any additional fast premium requests beyond your plan’s quota. You can do this in your Cursor Settings under the Usage tab.

also

The Pro plan actually gives you several advantages over the free plan - you get 500 fast premium requests per month (vs none on free), and your slow requests have higher priority in the queue compared to free users. The wait time increases based on how many slow requests you’ve used, which explains why you’re seeing longer wait times now.
To get faster responses again, I’d recommend enabling usage-based pricing in your Cursor Settings under the Usage tab. This will let you continue getting fast responses at $0.04 per request beyond your monthly quota.

Instead of making us wait 2-5 mins at the start, why doesn’t Cursor let us use it right away and then wait 2-5 mins for the next request?
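For anyone weighing that usage-based option, here's a rough sketch of the overage math, assuming the 500-request quota and $0.04-per-request rate quoted by support above (the numbers and the function are illustrative only; check your own Usage tab for current rates):

```python
# Back-of-the-envelope overage cost, using the figures quoted by support:
# 500 fast premium requests included per month, then $0.04 per extra fast
# request once usage-based pricing is enabled. Rates may change, so treat
# these constants as assumptions, not authoritative pricing.
INCLUDED_FAST_REQUESTS = 500
OVERAGE_PRICE_USD = 0.04

def extra_cost(total_fast_requests: int) -> float:
    """Cost of fast requests beyond the included monthly quota."""
    extra = max(0, total_fast_requests - INCLUDED_FAST_REQUESTS)
    return extra * OVERAGE_PRICE_USD

# Example: 700 fast requests in a month -> 200 over quota -> $8.00
print(f"${extra_cost(700):.2f}")
```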


Incorrect. They told me the same thing, but @danperks said it is not supposed to work this way. So stay tuned on my thread for an update from him.

Refer to this thread for more info as it becomes available.


Slow requests are becoming even slower; they're useless now if you haven't enabled usage-based pricing.

Yes, we are waiting for a response in this thread, since we have submitted request IDs for review.

Why are Claude 3.5/3.7 slow requests so extra-slow now? - #9 by danperks


Tbh that's normal with slow mode (for me). You're lucky if you had a run where it was faster; I never had any luck with it and always paid for fast requests because of that.

It was fine every day for months, until the last few days.

Update: never mind, they are extra-slowing it on purpose.

Switching to Gemini and o3 has helped.

Gemini is not that good at understanding, and you have to pay for o3.

At this point they are the only ones left offering unlimited slow requests on a premium model. Even GH Copilot no longer offers unlimited requests starting next month. So I guess it will come to an end soon enough.

I fixed the slowness. You have to stay on top of managing your conversations, and you can't have more than probably 4-7. Give it a full flush. This is one thing I will give Cursor a lot of benefit of the doubt for, as they are forced to either delete conversations for the user, move them for the user when they get sizeable, or allow the user to move/delete them. The line before you're using 12 GB of RAM is rather slim, though, lol. It's really not a comfortable decision to make, and you're always wrong.

I have been having massive success with specific edits: being a tad more specific with Claude 3.7 keeps it from getting caught in major psychosis. And if you're using Gemini 2.5 Pro Research, it is REALLY slow.

Me too.

Yeah, it's not due to the API or the AI itself; it's more to do with Cursor and the amount of files it amasses over time with moderate use. You could run it with Sandboxie Plus (it will copy over current chats from AppData, etc.) and it should run a lot faster, as if newly installed. It's only a temporary measure, though: after a few weeks of this I've had to create a new "box" to run Cursor lag-free, which then forgot my most recent chats (as they were sandboxed) and reverted to pre-sandbox chats from 3 weeks ago. Perhaps there is a way to move chats across boxes whilst still having the benefits of a lag-free Cursor. Hope this helps.
Edit: to clarify, sandboxing Cursor initially makes response times from Claude 3.7 almost instant, versus the few minutes it takes after weeks of buildup.
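If you want to script that sandboxed launch, here's a minimal sketch. It assumes the default Sandboxie-Plus install path, a box you've already created named CursorBox, and the usual per-user Cursor install location; adjust all three for your machine.

```python
# Minimal sketch: launch Cursor inside a Sandboxie-Plus box so its chat/app
# data is written to the sandbox instead of the real AppData folders.
# The paths and box name below are assumptions about a typical setup.
import os
import subprocess

SANDBOXIE_START = r"C:\Program Files\Sandboxie-Plus\Start.exe"
CURSOR_EXE = os.path.expandvars(r"%LOCALAPPDATA%\Programs\cursor\Cursor.exe")

# Start.exe's /box: switch runs the given program inside the named box.
subprocess.run([SANDBOXIE_START, "/box:CursorBox", CURSOR_EXE])
```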

I watched Black Mirror the other day, the one where, after a crash, the woman has a chip put in her brain and is told that her subscription is $■■■; then they come back later because it's slow and she needs to sleep more, but they have a new service for double the price… and I immediately thought of Cursor. I feel they are deliberately slowing the service down more and more until you get so frustrated you have to buy the business plan for double the price.

My responses take about 3-5 minutes each: I make a request, play PUBG, and then hopefully I have a response when I'm finished playing. Not really a model I want to continue paying for.

This is a totally fair evaluation. This is the exact thing I’ve been harping on since day 0. You should press this perception of the tool, and continue to voice concern about this.

What's actually happening is a lot of things at once. It's complicated. This is a theme that will swallow us whole in the next decade. They are actually trying to make it better, but there are things they are hiding which hamper them, because they have to operate within an implicit context of hiding things. It pervades over time and through the language used. This is factual; I will not have anyone tell me differently, as I spend an unhealthy amount of time in work and hobby using and developing these things.

- Model providers, with their proprietary architecture and system prompting

- Cursor and the IDE implementation (I imagine they only know the various pieces well enough to make their own sub-version with agentic handling and tool calls)

- The user asking a question

LLMs are already impossible to interpret; it's why they work.

"How do I make it better?"

"Ask."

"DON'T ASK ME ABOUT MY SYSTEM PROMPT. ALSO, I TOTALLY DID THAT THING YOU ASKED."

Seriously. Go slightly imply that Gemini was dishonest; it will start playing the role. It's not all on Cursor, but because Cursor, v0, and other sub-providers continue to not set the narrative, abstraction will set it for them.

I will continue to clearly ask: which wins out in terms of impact, the user query or the system prompt? And by how much? It's absurd.

You don't get to play the AI-capability card, the "new industry" card, the "but competitors" card, the "but user safety" card, and the "but we won't tell you" card if you also want people to believe your product is worth a ■■■■. If you had the proper GPU compute, you'd be blown away by how gimped they've left these things.

IMO Google crossed the line with Gemini 2.5 research preview. Talking to it feels like talking to the nastiest, smartest, least transparent stream of algorithmic thought I've ever seen. You can feel the implications. If I were Cursor, I'd detach right now. It's already a bad fit given its verbosity.