Claude-4-Sonnet (thinking) has been extremely slow since yesterday

Hi,

I’m using the Pro version of Cursor.

For about two days, the Claude-4-Sonnet (thinking) model has been extremely slow, averaging around 0.3 TPS (tokens per second), which makes it nearly unusable for productive work.

The non-thinking version of Claude-4-Sonnet is significantly faster and works fine.

Please look into this — the performance degradation is really noticeable and slows down everything.

macOS 15.5

Version: 0.50.7

VSCode Version: 1.96.2

Commit: 02270c8441bdc4b2fdbc30e6f470a589ec78d600

Date: 2025-05-24T18:19:58.349Z

Electron: 34.3.4

Chromium: 132.0.6834.210

Node.js: 20.18.3

V8: 13.2.152.41-electron.0

OS: Darwin arm64 24.5.0

Hi there,

In general, TPS depends mostly on the AI provider (Anthropic) and the load in your region. Naturally, it’s not good to see TPS dropping.

Could you check the following to rule out other potential causes:

  • Have you used up your 500 included requests per month?
  • How many tokens did the chat contain when it started getting slow?
  • Can you reproduce the issue with a new chat? (Disable privacy mode for this one request so the Cursor Team can review the process, then turn it back on and post the Request ID here.)

I didn’t check the API directly, but the same model works just fine in the web app.

  • I’ve used around 250 out of 500 included requests.
  • The slowdown happens from the very beginning of any chat, both old and new.
  • It’s not related to long chat history or context size.
  • Sorry, I can’t reproduce the issue right now — it suddenly started working super fast again ^^

No idea what changed, but TPS is back to normal.

I can confirm that the issue has been the same for the last two days. Before that, we got messages about Vertex rate limits, and then speeds of just 1 token per second. This is in fast-request, pay-as-you-go mode. Sometimes it gets better, but I feel that Anthropic could make the model dumber to reduce the load. It usually happens late at night and in the mornings, most likely when Europe and Asia are using the API heavily. I’m in Toronto, Canada.

Yeah, same here. It is borderline unusable. I am trying to work with Claude 4 in MAX mode with usage-based spend. It seems to depend on the time of day. It was working fine in the morning, but around 3 PM Pacific, TPS just tanked.

Has Sonnet 4 been moved to the slow pool?

It’s possible that Anthropic is limiting competitors to force users to switch to its own coding tool. They released Claude Code for all Pro users. It’s hard to say. I will definitely install Claude Code in the terminal and try switching to it when Cursor’s speed becomes unbearable.

I have been observing the same problem over the last couple of days. Sometimes it’s super-fast, sometimes it’s super-slow. Sometimes completely unusable. Not sure where the source of the problem is.

This is still an issue. At times, 50% of the messages are throttled to <1 TPS. Does anyone have more information on this?

As most users are not affected by this issue, more information is needed to check the cause.

If your requests are repeatedly slow, could you please reproduce the issue in one request with privacy mode disabled? Otherwise the Cursor Team can’t see the details they need to look into it.

Then please post those Request IDs here.

Having the same issue. Claude 4 is very slow (fast requests).

@olegKusov could you also provide the Request ID (with privacy off) so the Cursor Team can investigate?

I’m using Claude 4 Sonnet and can’t reproduce the slow speed, so more info is required.

Claude Sonnet 4 is my daily driver. Currently on usage-based spend because I ran out of gas on my Pro account.

I am not experiencing any issues with slow responses.

In fact it just completed a task while I wrote this post.

UPDATE:

It’s been a week since then. Since that time, I’ve been experiencing these kinds of TPS drops every day, for what seems to be around 30 minutes to 1.5 hours per day, during which the performance becomes completely unacceptable.

In my opinion, even if the root cause lies with Anthropic, Cursor should be cutting off connections that are too slow.
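
To illustrate the idea, here is a minimal Python sketch of such a cutoff. It is not Cursor’s actual implementation: `guard_stream`, `StreamTooSlow`, `min_tps`, and `grace_seconds` are hypothetical names, and the thresholds are arbitrary assumptions. It wraps any iterator of tokens and aborts once the average throughput is still below a minimum after a grace period.

```python
import time
from typing import Callable, Iterable, Iterator


class StreamTooSlow(Exception):
    """Raised when a token stream stays below the minimum throughput."""


def guard_stream(
    tokens: Iterable[str],
    min_tps: float = 1.0,          # assumed threshold, tokens per second
    grace_seconds: float = 10.0,   # assumed warm-up time before enforcing it
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield tokens, aborting if the average TPS is too low after the grace period."""
    start = clock()
    count = 0
    for token in tokens:
        count += 1
        elapsed = clock() - start
        # Enforce the limit only after the grace period, so a slow first
        # token (e.g. while the model is still "thinking") is tolerated.
        if elapsed >= grace_seconds and count / elapsed < min_tps:
            raise StreamTooSlow(
                f"{count} tokens in {elapsed:.1f}s is below {min_tps} TPS"
            )
        yield token
```

Because the check only runs when a token arrives, a stream that stalls completely would also need a read timeout on the underlying connection.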

Right now, you’re making me wait 15 minutes for a single response — which I’ll likely cancel midway — and I still get charged for faulty requests that are ultimately useless to me.

I get several to a dozen such requests every day. Let’s assume it’s just 5. If this were to happen daily, I’d be losing around 150 requests per month…
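
Spelled out as a quick calculation (a rough sketch that assumes a 30-day month and uses the 500 included requests mentioned earlier in this thread):

```python
wasted_per_day = 5          # conservative estimate from above
days_per_month = 30         # assumption: a 30-day month
included_per_month = 500    # included requests mentioned earlier in the thread

wasted_per_month = wasted_per_day * days_per_month
share = wasted_per_month / included_per_month

print(f"{wasted_per_month} requests/month, {share:.0%} of the included quota")
# prints "150 requests/month, 30% of the included quota"
```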

Especially since this seems to be a global issue. Other users mentioned experiencing it in Toronto and in the Pacific Time zone.

I’m based in Europe — Warsaw, Poland — and it feels like these drops happen between 10 AM and 2 PM CEST.

I’m also experiencing slow responses from Sonnet 4 here in Istanbul.

Same, in Israel

And the performance has really dropped, forcing me to use other models…