Intermittent Slow Token Streaming with Sonnet and Gemini Pro

:lady_beetle: Bug Description

As a Pro User, when using Sonnet and occasionally Gemini Pro, responses frequently become extremely slow — returning only 3–5 tokens per second. This severely impacts usability.

:counterclockwise_arrows_button: Steps to Reproduce

  1. Use Sonnet (or Gemini Pro) in Cursor for standard prompts or tool-assisted flows.
  2. Observe that roughly every 3rd to 5th request responds extremely slowly.
  3. Canceling and retrying the same request usually, but not always, restores normal speed.
  4. While the first response can be fast, subsequent processing (e.g. after tool calls) can alternate between fast and slow speeds.
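
To put a number on "slow" when reproducing this, the streaming rate can be estimated from per-token arrival times. The snippet below is only a rough sketch: it assumes the timestamps can be captured somehow (for example from a proxy log or a frame-by-frame recording), and the names `tokens_per_second`, `is_slow`, and `SLOW_THRESHOLD_TPS` are made up for illustration and are not part of Cursor or any model API. The threshold of 5 tokens/s is taken from the upper end of the 3–5 tokens/s range reported above.

```python
from typing import Sequence

# Assumption: anything at or below the upper end of the observed
# 3-5 tokens/s range counts as a "slow" response.
SLOW_THRESHOLD_TPS = 5.0


def tokens_per_second(arrival_times: Sequence[float]) -> float:
    """Average streaming rate from per-token arrival timestamps (in seconds)."""
    if len(arrival_times) < 2:
        raise ValueError("need at least two token timestamps")
    elapsed = arrival_times[-1] - arrival_times[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N-1 inter-token intervals.
    return (len(arrival_times) - 1) / elapsed


def is_slow(arrival_times: Sequence[float],
            threshold: float = SLOW_THRESHOLD_TPS) -> bool:
    """True if the measured rate is at or below the (assumed) slow threshold."""
    return tokens_per_second(arrival_times) <= threshold
```

For example, `tokens_per_second([0.0, 0.25, 0.5, 0.75, 1.0])` gives 4.0, which `is_slow` would flag. Logging this value per request would make it easier to confirm the "every 3rd to 5th request" pattern.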

:camera: Screenshots/Recordings

N/A — can provide upon request.

:laptop: Environment

  • OS: macOS Sequoia 15.5
  • Chip: Apple M3 Max
  • Memory: 64 GB
  • Cursor Version: 1.0.0 (Universal)
    VSCode Version: 1.96.2
    Commit: 53b99ce608cba35127ae3a050c1738a959750860
  • Model(s) affected: Sonnet, Gemini Pro

:prohibited: Impact

Yes, this issue makes Cursor effectively unusable during affected sessions and breaks flow in interactive workflows.