Extremely slow token generation with GPT-5.5 High, especially while the model is writing patches

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

I’m using GPT-5.5 High in Cursor 3.8.11 (Universal). Since sometime yesterday, the model has been taking an extremely long time to generate tokens in some contexts. For example, I watched a patch section being generated at less than one token per second: literally waiting a few seconds, seeing a token come through, waiting a few more seconds, and seeing another token come through.

I would estimate that it’s a 100-times slowdown versus normal token generation. This is making Cursor unusable.

Interestingly, thinking tokens sometimes seem to come through at full speed.
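To put a number on the slowdown instead of eyeballing it, here is a minimal sketch that turns token arrival timestamps into a rate and compares two phases of a response. The function names and the sample timestamps are hypothetical, chosen only to illustrate the arithmetic; they are not measurements from Cursor, and how you collect the timestamps (for example, by timing a screen recording) is up to you.

```python
def tokens_per_second(arrival_times):
    """Average token rate over a list of arrival timestamps (in seconds)."""
    if len(arrival_times) < 2:
        return 0.0
    elapsed = arrival_times[-1] - arrival_times[0]
    if elapsed <= 0:
        return 0.0
    # N timestamps span N - 1 intervals between tokens.
    return (len(arrival_times) - 1) / elapsed


def slowdown(normal_times, slow_times):
    """How many times slower the second phase is than the first."""
    slow_rate = tokens_per_second(slow_times)
    if slow_rate == 0:
        return float("inf")
    return tokens_per_second(normal_times) / slow_rate


# Hypothetical sample data, not real measurements:
thinking = [i * 0.02 for i in range(51)]  # one token every 0.02 s
patch = [i * 2.0 for i in range(6)]       # one token every 2 s

print(f"thinking: {tokens_per_second(thinking):.1f} tokens/s")
print(f"patch:    {tokens_per_second(patch):.1f} tokens/s")
print(f"slowdown: {slowdown(thinking, patch):.0f}x")
```

With these made-up inputs, a stream that normally delivers a token every 0.02 seconds and then drops to one token every 2 seconds works out to the same order of magnitude as the slowdown estimated above.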

Steps to Reproduce

Choose GPT-5.5 High in Cursor, ask it to work on a decent-sized codebase, and observe extremely slow token generation while it writes code.

Expected Behavior

I expect tokens to be generated at the typical rate.

Operating System

macOS

Version Information

Version: 3.8.11 (Universal)
VS Code Extension API: 1.105.1
Commit: e56ad3440df06d22ca7501e65fd518e905486ef0
Date: 2026-06-18T01:40:18.333Z
Layout: editor
Build Type: Stable
Release Track: Default
Electron: 40.10.3
Chromium: 144.0.7559.236
Node.js: 24.15.0
V8: 14.4.258.32-electron.0
xterm.js: 6.1.0-beta.256
OS: Darwin arm64 24.5.0

For AI issues: which model did you use?

GPT-5.5 High

Does this stop you from using Cursor?

Yes - Cursor is unusable

It’s because of the high demand for Codex (OpenAI reset everybody’s usage).

Try Composer (Cursor probably has the most computing power after the merge with SpaceX).

The pattern you’re describing - reasoning streaming at normal speed but the code/patch output crawling to ~1 token/sec, with the slowdown appearing suddenly - usually points to a network or connection issue between your machine and our servers rather than the model itself. This doesn’t appear to be a widespread issue on our end right now, which is consistent with a cause on the network path.

A few things to try, in order:

  1. Run Network Diagnostics: Cursor Settings → Network → Run Diagnostics. This tests your connection to our servers and flags anything affecting streaming. (Network troubleshooting)
  2. If you’re behind a corporate proxy or VPN: in the same Network settings, set HTTP Compatibility Mode to HTTP/1.1 and restart Cursor. Some proxies throttle or buffer HTTP/2 streaming, which slows token-by-token output specifically.
  3. Isolate the network: try a different connection for a few minutes (e.g., a phone hotspot). If the slowdown disappears, that confirms the bottleneck is the network path.
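When comparing two connections as in step 3, it can help to look at the spacing between streamed chunks and not just the overall speed. The sketch below is a rough heuristic under stated assumptions: it takes chunk arrival timestamps you have collected yourself, and labels the stream as bursty (long stalls followed by rapid output, which is what buffering along the path would tend to look like) or steady but slow. The function names and both thresholds are hypothetical and are not part of Cursor's diagnostics.

```python
from statistics import mean, pstdev


def gap_stats(arrival_times):
    """Summarize the gaps between consecutive chunk arrivals (seconds)."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if not gaps:
        return None
    avg = mean(gaps)
    # Coefficient of variation: spread of the gaps relative to their mean.
    cv = pstdev(gaps) / avg if avg > 0 else 0.0
    return {"mean_gap": avg, "max_gap": max(gaps), "cv": cv}


def describe_stream(arrival_times, slow_gap=1.0, bursty_cv=1.0):
    """Label a stream using assumed, untuned thresholds."""
    stats = gap_stats(arrival_times)
    if stats is None:
        return "not enough data"
    if stats["cv"] >= bursty_cv:
        return "bursty"           # uneven gaps: stalls, then rapid output
    if stats["mean_gap"] >= slow_gap:
        return "steady but slow"  # evenly spaced, but far apart
    return "steady"


# Hypothetical timestamps for illustration only:
normal = [i * 0.02 for i in range(20)]
slow = [0.0, 2.0, 4.0, 6.0]
bursty = [0.0, 0.01, 0.02, 0.03, 5.0, 5.01, 5.02, 5.03, 10.0]

for name, times in [("normal", normal), ("slow", slow), ("bursty", bursty)]:
    print(name, "->", describe_stream(times))
```

If the label changes between your usual connection and the hotspot, that is one more data point to include alongside the Request ID; it does not by itself identify the cause.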

If it’s still slow after that, grab a Request ID from one of the slow chats (open the chat → “…” menu → Copy Request ID) and paste it here so we can trace the timing on our side. Connectivity issues are debuggable even with Privacy Mode on. (Finding a request ID)

GPT-5.5 also has a higher-throughput “Fast” option in the model picker (at a higher per-token cost), but if the root cause is the network path that won’t make a difference - so I’d start with the diagnostics above.