Cursor fails to parse token usage for most 3rd-party APIs

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

When you use the OpenAI override and add 3rd-party models, Cursor stops tracking token usage for some reason, despite the providers returning correctly formatted token usage. Sometimes it does display some data in the dashboard (highlighted in the screenshot), but it’s incorrect, and I don’t see these numbers anywhere in the API calls. The problem does not seem to be connected to tool calls (it still happens with 0 tool calls), mode (agent/ask/plan), or the provider’s usage return style (zai returns usage in the same chunk as finish_reason, minimax returns it as a separate chunk).

Steps to Reproduce

  1. Add minimax/glm models via coding plan
  2. Make a request
  3. See “0” for token usage in dashboard.

Expected Behavior

Correct numbers of tokens in dashboard.

Screenshots / Screen Recordings

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.3.29 (user setup)
VSCode Version: 1.105.1
Commit: 4ca9b38c6c97d4243bf0c61e51426667cb964bd0
Date: 2026-01-08T00:34:49.798Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Windows_NT x64 10.0.26200

Does this stop you from using Cursor

No - Cursor works, but with this issue

Hey, thanks for the report.

To investigate this bug, we’ll need a bit more info:

  1. Request IDs: Pick a few requests with zero usage in the dashboard, open the chat context menu (three dots in the top right) > Copy Request ID. Send 2 to 3 of those IDs.

  2. Console logs: Open Developer Console (Help > Toggle Developer Tools), run a request to minimax/glm, then copy the logs from the console.

This will help the team understand how providers format token usage and fix the parsing.

I run a small proxy to intercept requests to the API and to force the correct temperature and re-parse thinking for GLM, but the problem persists with and without the proxy. For Minimax I do not change anything in the upstream response, so that should not matter.
GLM:
abcc039f-ce29-4f46-a968-0d936fd408d5
Example of the returned format:

-- PARSED STREAM CHUNK 126 len=611 --
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": " if"}}]}

data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": " needed"}}]}

data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "."}}]}

-- END PARSED STREAM CHUNK --
-- PARSED STREAM CHUNK 127 len=430 --
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "finish_reason": "stop", "delta": {"role": "assistant", "content": ""}}], "usage": {"prompt_tokens": 48230, "completion_tokens": 916, "total_tokens": 49146, "prompt_tokens_details": {"cached_tokens": 40184}, "completion_tokens_details": {"reasoning_tokens": 439}}}

data: [DONE]

-- END PARSED STREAM CHUNK --

Minimax:
7044ea63-2968-4240-8276-e873c1007e72

-- STREAM CHUNK 25 len=505 --
data: {"id":"05aedf5a5e50f5c32c66d34663da7ccc","choices":[{"finish_reason":"stop","index":0,"delta":{"content":")\n- Both systems operate independently - the database logging is async and won't fail if database is unavailable","role":"assistant","name":"MiniMax AI","audio_content":""}}],"created":1767877722,"model":"MiniMax-M2.1","object":"chat.completion.chunk","usage":null,"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}

-- END STREAM CHUNK --
-- STREAM CHUNK 26 len=384 --
data: {"id":"05aedf5a5e50f5c32c66d34663da7ccc","choices":[],"created":1767877722,"model":"MiniMax-M2.1","object":"chat.completion.chunk","usage":{"total_tokens":34048,"total_characters":0,"prompt_tokens":33311,"completion_tokens":737,"completion_tokens_details":{"reasoning_tokens":298},"prompt_tokens_details":{"cached_tokens":30997}},"base_resp":{"status_code":0,"status_msg":""}}

-- END STREAM CHUNK --
stop_reason=stop

Also, just to be sure, I ran it against the raw Minimax API without any in-between proxy:
2286bead-3561-441f-b0f4-b46113faff12

The console logs are in the attached file, but they’re just some timeout warnings.
cursor_console.txt (5.8 KB)

Great, thanks for the detailed logs and format examples! I can see that both providers return valid usage data (GLM includes it in the final chunk along with finish_reason, and Minimax sends it in a separate chunk).
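To make the two usage styles concrete, here is a minimal Python sketch of a stream reader that handles both. It is not Cursor's actual parser, which is not shown in this thread; `extract_usage` is a hypothetical helper, and the sample streams are trimmed versions of the chunks posted above (only the fields relevant to usage are kept). The key assumption is that a client should take the last non-null `usage` object from any chunk, without requiring `choices` to be non-empty.

```python
import json
from typing import Iterable, Optional


def extract_usage(sse_lines: Iterable[str]) -> Optional[dict]:
    """Return the last non-null `usage` object seen in an SSE stream.

    Hypothetical helper for illustration; not taken from Cursor.
    """
    usage = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # Do not require a non-empty `choices` list here: Minimax sends
        # usage in a chunk whose `choices` is [], and sends "usage": null
        # in the chunk that carries finish_reason.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# GLM style (trimmed): usage arrives in the same chunk as finish_reason.
GLM_STREAM = [
    'data: {"choices": [{"index": 0, "finish_reason": "stop", "delta": {"content": ""}}], '
    '"usage": {"prompt_tokens": 48230, "completion_tokens": 916, "total_tokens": 49146}}',
    "",
    "data: [DONE]",
]

# Minimax style (trimmed): usage is null on the finish chunk, then sent alone.
MINIMAX_STREAM = [
    'data: {"choices": [{"finish_reason": "stop", "index": 0, "delta": {"content": "..."}}], "usage": null}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 34048, "prompt_tokens": 33311, "completion_tokens": 737}}',
]

if __name__ == "__main__":
    print(extract_usage(GLM_STREAM))
    print(extract_usage(MINIMAX_STREAM))
```

In both samples `prompt_tokens + completion_tokens` equals `total_tokens`, so the numbers the providers return are internally consistent. A reader that stops at the first chunk with `finish_reason` set, or that skips chunks with empty `choices`, would miss the Minimax usage. Whether that is what happens inside Cursor is only a guess.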

I’m forwarding the request IDs and the format examples to the team so they can investigate the parsing issue. This should help us add support for these providers in our token tracking system.