Cursor is sending Responses API requests to the Chat Completions API

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

As the title describes.

Steps to Reproduce

  1. Fill in your OpenAI API key.
  2. Switch on Override OpenAI Base URL, but set it to the standard endpoint: https://api.openai.com/v1.
  3. Ask GPT-5.1-Codex a question.

And you’ll see the following error:

Request failed with status code 404: {
  "error": {
    "message": "This is not a chat model and thus not supported in the v1/chat/completions endpoint. Did you mean to use v1/completions?",
    "type": "invalid_request_error",
    "param": "model",
    "code": null
  }
}

And if you point the override URL at a server you control and dump the request, you'll see it's actually a Responses API payload, but it's being sent to your Chat Completions endpoint.
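For reference, here's a minimal sketch of the two payload shapes in TypeScript (field names per OpenAI's public API docs; the prompt text is made up). If the dumped body carries an "input" field instead of "messages", it's a Responses API payload:

  // Chat Completions payload, expected at POST {baseUrl}/chat/completions
  const chatCompletionsBody = {
    model: "gpt-5.1-codex",
    messages: [{ role: "user", content: "Hello" }],
  };

  // Responses API payload, expected at POST {baseUrl}/responses
  // Note "input" (plus optional "instructions") instead of "messages".
  const responsesBody = {
    model: "gpt-5.1-codex",
    input: [{ role: "user", content: "Hello" }],
  };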

Expected Behavior

Yes, it's definitely good to use the Responses API, as it's slightly smarter, but please don't send those requests to the Chat Completions endpoint…

Operating System

macOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.1.39
VSCode Version: 1.105.1
Commit: 60d42bed27e5775c43ec0428d8c653c49e58e260
Date: 2025-11-27T02:30:49.286Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Darwin arm64 24.6.0

For AI issues: which model did you use?

GPT-5.1-Codex

Does this stop you from using Cursor?

No - Cursor works, but with this issue

Hey, thanks for the report!

The issue: GPT-5/GPT-5.1 Codex models require OpenAI’s /v1/responses API endpoint, but BYOK (Bring Your Own Key) in Cursor currently only supports /v1/chat/completions. This causes the 404 error you’re seeing.
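You can confirm the endpoint split outside Cursor with a quick sketch like the one below (assumptions: Node 18+ with global fetch, OPENAI_API_KEY set, and the model id "gpt-5.1-codex" as given in the report). The same key succeeds against /v1/responses but reproduces the 404 against /v1/chat/completions:

  // Sketch only: checks which endpoint actually serves the model.
  const headers = {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  };

  // Codex models are served by the Responses API.
  const ok = await fetch("https://api.openai.com/v1/responses", {
    method: "POST",
    headers,
    body: JSON.stringify({ model: "gpt-5.1-codex", input: "Hello" }),
  });
  console.log(ok.status); // expect 200

  // Wrong endpoint for this model: returns the 404 from the report.
  const bad = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers,
    body: JSON.stringify({
      model: "gpt-5.1-codex",
      messages: [{ role: "user", content: "Hello" }],
    }),
  });
  console.log(bad.status); // expect 404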

Workaround: Use GPT-5.1 Codex without a custom OpenAI API key - it works fine with Cursor’s built-in API. You can toggle your API key on/off in Settings > Models > API Keys (or use Cmd+Shift+0).

The team is already tracking this limitation. See the related discussion here: "With custom model api key get request error"

Thanks for the reply! Yeah, it's almost always better to use the Responses API: the "slightly smarter" behavior is a really noticeable deal, and so are the caching improvements.

In the meantime, I've gotten GitHub - Laisky/one-api (an open-source OpenRouter alternative; a multi-model, multi-API-format, multi-tenant LLM API aggregation platform) to support handling Responses API request payloads sent to Chat Completions endpoints, so it can serve as a temporary workaround.
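For anyone curious, the shim boils down to detecting a Responses-shaped body arriving at the Chat Completions route and converting it before handling. A rough sketch of the idea (my own illustration, not one-api's actual code; it only handles plain-text content):

  type ChatMessage = { role: string; content: string };

  function responsesToChatCompletions(body: Record<string, any>) {
    // Responses payloads carry "input" (a string or message array) instead of "messages".
    if (!("input" in body)) return body; // already Chat Completions shaped
    const messages: ChatMessage[] = [];
    if (typeof body.instructions === "string") {
      // Responses-level instructions map naturally onto a system message.
      messages.push({ role: "system", content: body.instructions });
    }
    if (typeof body.input === "string") {
      messages.push({ role: "user", content: body.input });
    } else {
      for (const item of body.input) {
        messages.push({ role: item.role, content: item.content });
      }
    }
    return { model: body.model, messages };
  }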

I'm actually very keen to see a broader migration to the Responses API, at least for the GPT-5 and GPT-5.1 model families.