I run a small proxy that intercepts requests to the API, forces the correct temperature, and re-parses the thinking output for GLM, but the problem persists both with and without the proxy. For MiniMax I do not change anything in the upstream response, so the proxy should not matter there.
GLM:
abcc039f-ce29-4f46-a968-0d936fd408d5
Example of the returned format:
-- PARSED STREAM CHUNK 126 len=611 --
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": " if"}}]}
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": " needed"}}]}
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "."}}]}
-- END PARSED STREAM CHUNK --
-- PARSED STREAM CHUNK 127 len=430 --
data: {"id": "20260108210546e4b5a3ee404646de", "created": 1767877546, "object": "chat.completion.chunk", "model": "glm-4.7", "choices": [{"index": 0, "finish_reason": "stop", "delta": {"role": "assistant", "content": ""}}], "usage": {"prompt_tokens": 48230, "completion_tokens": 916, "total_tokens": 49146, "prompt_tokens_details": {"cached_tokens": 40184}, "completion_tokens_details": {"reasoning_tokens": 439}}}
data: [DONE]
-- END PARSED STREAM CHUNK --
MiniMax:
7044ea63-2968-4240-8276-e873c1007e72
-- STREAM CHUNK 25 len=505 --
data: {"id":"05aedf5a5e50f5c32c66d34663da7ccc","choices":[{"finish_reason":"stop","index":0,"delta":{"content":")\n- Both systems operate independently - the database logging is async and won't fail if database is unavailable","role":"assistant","name":"MiniMax AI","audio_content":""}}],"created":1767877722,"model":"MiniMax-M2.1","object":"chat.completion.chunk","usage":null,"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}
-- END STREAM CHUNK --
-- STREAM CHUNK 26 len=384 --
data: {"id":"05aedf5a5e50f5c32c66d34663da7ccc","choices":[],"created":1767877722,"model":"MiniMax-M2.1","object":"chat.completion.chunk","usage":{"total_tokens":34048,"total_characters":0,"prompt_tokens":33311,"completion_tokens":737,"completion_tokens_details":{"reasoning_tokens":298},"prompt_tokens_details":{"cached_tokens":30997}},"base_resp":{"status_code":0,"status_msg":""}}
-- END STREAM CHUNK --
stop_reason=stop
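The two logs above end the stream differently: GLM sends finish_reason and usage in the same chunk and then data: [DONE], while MiniMax sends finish_reason with usage set to null, followed by a separate chunk with an empty choices list that carries usage (no [DONE] line appears in the excerpt). As a rough sketch of how a client could read both shapes, something like the following might work. The function name summarize_sse is made up for illustration; this is not the proxy's code and not how Cursor parses the stream.

```python
import json


def summarize_sse(lines):
    """Collect text, finish_reason and usage from OpenAI-style `data:` lines.

    Hypothetical helper for illustration only.
    """
    text = []
    finish_reason = None
    usage = None
    saw_done = False
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip log markers such as "-- STREAM CHUNK 25 ... --"
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            continue
        chunk = json.loads(payload)
        # The last MiniMax chunk has an empty `choices` list,
        # so iterate instead of indexing choices[0].
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            text.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
        # Usage may be null on the finish chunk and arrive in a later one.
        usage = chunk.get("usage") or usage
    return {
        "text": "".join(text),
        "finish_reason": finish_reason,
        "usage": usage,
        "saw_done": saw_done,
    }
```

Feeding it the lines from either log should give finish_reason "stop" and a populated usage dict. Only the saw_done flag would differ between the two excerpts.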
Also, just to be sure, I ran it against the raw MiniMax API without any proxy in between:
2286bead-3561-441f-b0f4-b46113faff12
The console logs are in the attached file, but they contain only some timeout warnings.
cursor_console.txt (5.8 KB)