Cursor Agent sends Responses API format to /chat/completions endpoint

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

When using a custom LLM provider via LiteLLM proxy configured with the OpenAI-compatible /chat/completions endpoint, Cursor Agent mode sends requests in OpenAI Responses API format instead of Chat Completions format. LiteLLM’s /chat/completions handler expects standard Chat Completions format, causing 400/500 errors.

Specific issues observed:

  • Request body uses `input: […]` instead of `messages: […]`
  • Responses API-only parameters are sent: `store`, `include`, `prompt_cache_retention`, `previous_response_id`, `truncation`, `reasoning` (as a dict), `text` (as a format object)
  • Tools are sent in the flat Responses API format `{"type":"function","name":"…"}` instead of the nested Chat Completions format `{"type":"function","function":{"name":"…"}}`
  • Non-standard tool types are sent, e.g. `{"type":"custom","name":"ApplyPatch",…}` with a grammar-based `format` field that has no equivalent in Chat Completions
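To illustrate the mismatch, here is a hypothetical side-by-side of the two payload shapes as Python dicts (placeholder field values, not captured traffic; `read_file` is an invented tool name):

```python
# Shape Cursor Agent sends (Responses API style) -- illustrative only:
responses_payload = {
    "input": [{"role": "user", "content": "hello"}],       # Responses API conversation field
    "store": False,                                        # Responses API-only parameter
    "reasoning": {"effort": "medium"},                     # dict, not a flat parameter
    "text": {"format": {"type": "text"}},                  # format object
    "tools": [{"type": "function", "name": "read_file"}],  # flat tool definition
}

# Shape /chat/completions expects (Chat Completions style):
chat_payload = {
    "messages": [{"role": "user", "content": "hello"}],    # standard conversation field
    "reasoning_effort": "medium",                          # flat string parameter
    "tools": [{"type": "function",
               "function": {"name": "read_file"}}],        # nested tool definition
}
```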

Steps to Reproduce

  1. In Cursor settings, configure a custom LLM provider pointing to a LiteLLM proxy /chat/completions endpoint
  2. Select a model served by that provider
  3. Open the Agent panel and send any message
  4. Observe the request received by LiteLLM — it will be in Responses API format rather than Chat Completions format

Expected Behavior

Cursor should send requests to the /chat/completions endpoint in standard OpenAI Chat Completions format — specifically:

  • `messages: […]` as the conversation payload
  • Tools in the nested format `{"type":"function","function":{…}}`
  • Only Chat Completions-compatible parameters
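For reference, a minimal well-formed Chat Completions request body might look like this (model name taken from this report; tool name and prompt text are invented placeholders):

```python
# Minimal Chat Completions-shaped body a proxy like LiteLLM can route as-is.
expected_body = {
    "model": "gpt-5.1",
    "messages": [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Refactor this function."},
    ],
    "tools": [
        {
            "type": "function",
            "function": {  # nested, per the Chat Completions spec
                "name": "apply_patch",
                "description": "Apply a diff to the workspace.",
                "parameters": {"type": "object", "properties": {}},
            },
        }
    ],
    "stream": True,
}
```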

Operating System

Windows 10/11

Version Information

Version: 2.5.25 (user setup)
VSCode Version: 1.105.1
Commit: 7150844152b426ed50d2b68dd6b33b5c5beb73c0
Date: 2026-02-24T07:17:49.417Z
Build Type: Stable
Release Track: Default
Electron: 39.4.0
Chromium: 142.0.7444.265
Node.js: 22.22.0
V8: 14.2.231.22-electron.0
OS: Windows_NT x64 10.0.26100

For AI issues: which model did you use?

This occurs with any model configured through a custom LiteLLM provider. Confirmed with gpt-5.1 served via Azure OpenAI through LiteLLM proxy.

Additional Information

A workaround is possible but requires monkey-patching LiteLLM’s `_read_request_body` function at startup to intercept and convert the Responses API format to Chat Completions format before the request reaches the router. The patch needs to handle:

  • Converting `input` → `messages`
  • Converting flat tool definitions to the nested `function` format
  • Converting `{"type":"custom",…}` tools to standard function tools
  • Converting `reasoning: {"effort": "…"}` → `reasoning_effort: "…"`
  • Converting `text: {"format": {…}}` → `response_format`
  • Stripping all remaining Responses API-only parameters
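The conversions above can be sketched as a single pure function. This is a hedged illustration of the mapping, not LiteLLM’s actual hook; it assumes the field shapes described in this report:

```python
def responses_to_chat_completions(body: dict) -> dict:
    """Sketch: map a Responses API-style payload to Chat Completions shape."""
    out = dict(body)

    # input -> messages
    if "input" in out:
        out["messages"] = out.pop("input")

    # flat and {"type": "custom"} tool definitions -> nested function tools
    tools = []
    for tool in out.get("tools", []):
        if tool.get("type") in ("function", "custom") and "function" not in tool:
            fn = {k: v for k, v in tool.items()
                  if k in ("name", "description", "parameters")}
            tools.append({"type": "function", "function": fn})
        else:
            tools.append(tool)
    if tools:
        out["tools"] = tools

    # reasoning: {"effort": ...} -> reasoning_effort: "..."
    reasoning = out.pop("reasoning", None)
    if isinstance(reasoning, dict) and "effort" in reasoning:
        out["reasoning_effort"] = reasoning["effort"]

    # text: {"format": {...}} -> response_format
    text = out.pop("text", None)
    if isinstance(text, dict) and "format" in text:
        out["response_format"] = text["format"]

    # strip remaining Responses API-only parameters
    for key in ("store", "include", "prompt_cache_retention",
                "previous_response_id", "truncation"):
        out.pop(key, None)

    return out
```

In a real deployment this logic would run before LiteLLM’s router sees the body, e.g. inside the monkey-patched `_read_request_body` mentioned above.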

The fact that this workaround is needed at all — and its complexity — illustrates how far the Agent request format deviates from the Chat Completions spec on the /chat/completions endpoint.

Does this stop you from using Cursor?

No - Cursor works, but with this issue

Hey, thanks for the report, this is a known issue.

What’s happening: when using BYOK with a base URL override, Cursor Agent sends an OpenAI Responses API payload (input, flat tool format, etc.) to the /chat/completions endpoint instead of using the Chat Completions format.


The team is aware. There’s no ETA for a fix yet, but your detailed report, especially the breakdown of all the format mismatches, really helps with prioritization.

For now, the main solution is the workaround you already found: intercept and convert the payload on the proxy side. Another approach some users take is to avoid the base URL override and use Cursor’s built-in API routing, if that fits your setup.

