Getting "Unauthorized User API key" for valid credentials against a self-hosted LLM

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

I am no longer able to chat with my own self-hosted model in “Ask” mode due to “Unauthorized User API key” errors. A couple of days ago it worked just fine; now it does not.

The custom model with the associated key works fine in other tools, such as Cline and Kilo Code.

Steps to Reproduce

If required, I can separately provide the custom OpenAI endpoint and the API key for testing.

Expected Behavior

In “Ask” mode, I should be able to chat with my own model, which is hosted on a custom endpoint that implements the OpenAI API specification.

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.7.40 (user setup)
VSCode Version: 1.99.3
Commit: df79b2380cd32922cad03529b0dc0c946c311850
Date: 2025-10-09T02:55:11.735Z
Electron: 34.5.8
Chromium: 132.0.6834.210
Node.js: 20.19.1
V8: 13.2.152.41-electron.0
OS: Windows_NT x64 10.0.22631

For AI issues: which model did you use?

gpt-4o

For AI issues: add Request ID with privacy disabled

Request ID: b7ca605b-464e-42e0-a300-58f90bad9d69

Additional Information

If required, I can separately provide the custom OpenAI endpoint and the API key for testing.

Does this stop you from using Cursor

Yes - Cursor is unusable


Tagging my colleague @condor who might be able to help you with this issue


@syllil thank you for the detailed bug report.

Is the OpenAI-compatible API publicly reachable? If so, please post only the URL here.
(I’ll send you a DM in case the URL is sensitive.)

Is the model gpt-4o also available on that provider, or is the model name different there?
(In some cases providers version model names with a date suffix.)

Does the issue occur only on existing chats, or also on new chats?

Could you please also try a Custom Mode and let me know whether that improves access?

Hi, @condor!

Yes, we are using Azure OpenAI models that are exposed through our proxy that implements the OpenAI API.

I can send you the base URL privately, as it is sensitive from my organization’s point of view.

I’ll let you know soon whether it works with a Custom Mode.

Thank you for helping me out here in advance!

Hi again, @condor!

I tested again with a Custom Mode in my Cursor editor and, unfortunately, it did not make any difference.

@condor In case it helps isolate the issue:

I haven’t noticed any changes when using my own proxy, which also implements the OpenAI API; the only difference is that it uses model names like gpt-[high/medium/low/minimal] instead of the reported gpt-4o.

I’ve checked with 1.7.44 and 1.7.46: Cursor keeps making the same POST /chat/completions requests as usual, and it works in Ask/Agent/Custom modes, which makes me think the issue is more related to the proxy.

@syllil Are you using a closed-source proxy, or is it something you can share?

@gabrii I checked with my company and I was allowed to share the URL with you.

The base URL I enter to override OpenAI’s default base URL is https://openai.softronic.ai/v1.

@syllil Thanks for sharing, interesting service! The issue seems to be that Azure stopped supporting the /completions API for gpt-4o (and all models except gpt-3.5):

curl https://openai.softronic.ai/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "prompt": "Say this is a test",
    "max_tokens": 7,
    "temperature": 0,
    "stream": true
  }'

{"error":{"code":"OperationNotSupported","message":"The completion operation does not work with the specified model, gpt-4o. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.","type":null,"param":null,"inner_error":null}}
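For anyone debugging a similar setup, this kind of failure can be detected programmatically instead of eyeballing the raw response. A minimal sketch in Python, assuming only the error payload shape shown above (the helper name is mine, not part of any API):

```python
import json

def is_operation_not_supported(response_body: str) -> bool:
    """Return True if an Azure-style error payload says the requested
    endpoint does not work with the model (code "OperationNotSupported")."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    error = payload.get("error") or {}
    return error.get("code") == "OperationNotSupported"

# The (abridged) error body returned above for gpt-4o on /v1/completions:
body = json.dumps({
    "error": {
        "code": "OperationNotSupported",
        "message": "The completion operation does not work with the specified model, gpt-4o.",
        "type": None,
        "param": None,
        "inner_error": None,
    }
})
print(is_operation_not_supported(body))  # → True
```

A check like this makes it easy to probe each endpoint/model combination and see which ones the backend actually serves.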

The only Azure/Softronic model that seems to work with /completions is gpt-3.5-turbo-instruct. And neither Softronic nor Cursor supports BYOK for models that are only served through the /responses API.

Hence the existence of my proxy, which serves /responses-only models (such as GPT-5) through the old /completions API that Cursor supports.

For this to work, Softronic would need to implement a proxy similar to mine, one that can keep serving models through /completions indefinitely.
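To make the idea concrete: the core of such a translation layer is just request/response reshaping, accepting a legacy-format request body and rewriting it into the format the backend actually supports, then folding the reply back. A hedged sketch (field names follow the public OpenAI API; the helper names are mine, and a real proxy would also have to handle streaming, auth forwarding, and error passthrough):

```python
def completions_to_chat(legacy: dict) -> dict:
    """Rewrite a legacy /v1/completions request body into a
    /v1/chat/completions body: the prompt becomes a single user message,
    everything else is passed through unchanged."""
    chat = {k: v for k, v in legacy.items() if k != "prompt"}
    chat["messages"] = [{"role": "user", "content": legacy.get("prompt", "")}]
    return chat

def chat_to_completions(chat_resp: dict) -> dict:
    """Fold a chat completion response back into the legacy shape:
    each choice's message content becomes the choice's 'text' field."""
    legacy = dict(chat_resp)
    legacy["object"] = "text_completion"
    legacy["choices"] = [
        {
            "index": c.get("index", i),
            "text": c.get("message", {}).get("content", ""),
            "finish_reason": c.get("finish_reason"),
        }
        for i, c in enumerate(chat_resp.get("choices", []))
    ]
    return legacy

# Example: the curl request from earlier, reshaped for a chat-only backend.
chat_req = completions_to_chat(
    {"model": "gpt-4o", "prompt": "Say this is a test", "max_tokens": 7}
)
print(chat_req["messages"])  # → [{'role': 'user', 'content': 'Say this is a test'}]
```

The same pattern extends to the /responses API; only the field mapping changes.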

In case you work at Softronic: I’m open to freelance opportunities or feature sponsorships for my project! My email is me at gabrii dot com.

This topic was automatically closed 22 days after the last reply. New replies are no longer allowed.