I paid for the Pro plan, but during the day the Claude Sonnet models are often overloaded and unavailable. As more users join, the service gets so bogged down that it stops working reliably. This is disappointing, as the product is becoming less usable over time.
I have a local Ollama setup and am considering using proxy services like ngrok to integrate it, but I believe there should be a built-in feature to offload requests to local models during peak times. I don’t have a Claude account myself, so my main frustration is that Cursor AI is losing its value due to these ongoing availability issues.
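For anyone curious about the workaround mentioned above, a minimal sketch of exposing a local Ollama server through an ngrok tunnel might look like this (11434 is Ollama's default port; whether Cursor can actually consume such an endpoint is an assumption, not something confirmed here):

```shell
# Start the local Ollama server (listens on 127.0.0.1:11434 by default).
ollama serve &

# Open a public HTTPS tunnel to the local Ollama port.
# Requires an ngrok account with an authtoken configured beforehand.
ngrok http 11434
```

The resulting public URL could then be pointed at by any tool that accepts an OpenAI-compatible base URL (Ollama exposes one under `/v1`), though the latency and limits of a free ngrok tunnel may make this impractical for daily use.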
I originally thought the “High Load” message was Anthropic’s fault… but then I manually used Claude 3.7 API and realized it worked… so I emailed Cursor and they replied:
You’re right to ask for clarification. The high load notices you’re seeing are indeed related to Cursor’s system-wide capacity management, not just Claude’s API availability. When there’s high demand, our system queues requests for both Claude 3.7 and 3.5 to maintain stable performance - that’s why you’re seeing it for both models. Using your own API key wouldn’t bypass this as it’s our system managing the load, not Claude’s API limits.
From what I understand, large language models are only as efficient as their available computational resources. This means that when there are a lot of users and high demand, fewer computational resources are available for each user interaction, and its capabilities suffer.
This is why I believe large language models can seem less effective during the day, when many people are using them and resources are spread thin. But late at night, when I use the system, it feels better and smarter because more computational resources are available for fewer user interactions.
I might be wrong, but from what I understand, you can take a very intelligent model and run it on an old computer, and it will perform poorly and produce bad results because it doesn’t have the computational resources. Similarly, you can take a basic model, put it in an environment with lots of computational resources, and it will dramatically improve.
This is literally my experience. I'm in GMT+1, around 6 hours ahead of USA devs, and when it's around 10 pm here, the quality of Claude 3.7 Thinking drops significantly. It starts hallucinating, and fixes end up in a loop where it tries to fix the same bug with two different methods; if the second doesn't work, it tries the first method again.
When I use 3.7 Thinking while USA devs are sleeping, everything works fine, and it's even able to fix bugs I haven't mentioned in the chat box.
To be fair, I downgraded to 0.45.14 because I had read a lot of bad feedback about 0.46.x code generation, and since the downgrade everything works fine (except for the behaviour described above).