Adding another data point — same core issue (BYOK custom model broken), plus context on why this is a blocker for high-volume Team Plan users.
My setup
- Model: GLM-5 (via Z.AI, OpenAI-compatible endpoint)
- Override OpenAI Base URL: https://api.z.ai/api/coding/paas/v4
- OS: Windows 11 (WSL2 Ubuntu 22.04)
- Cursor Version: (fill in from Menu → About Cursor → Copy)
- Request ID: ec3d329b-1d0e-432a-93f9-c1513facd078
Issue 1: BYOK broken — same as this thread
Every request returns:
`Invalid API key. Unauthorized User API key`
The API key works fine via direct curl:
```shell
curl -s https://api.z.ai/api/coding/paas/v4/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <my_z_ai_api_key>" \
  -d '{"model":"GLM-5","messages":[{"role":"user","content":"ping"}],"max_tokens":10}'
# → HTTP 200, valid response
```
Dashboard shows User API Key | GLM-5 | 0 tokens | $0.00 — the request hits Cursor’s proxy but auth fails before reaching Z.AI.
Tried: fresh chat, re-adding model, re-entering key, restarting Cursor, multiple API keys. Nothing works.
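For anyone who wants to script the same sanity check, here is a stdlib-only Python equivalent of the curl test above. The endpoint and model name are taken from this report; `ZAI_API_KEY` is a placeholder environment variable, not an official name:

```python
# Stdlib-only reproduction of the direct-API check outside Cursor.
# Endpoint and model taken from the report above; ZAI_API_KEY is a
# placeholder env var of my choosing, not an official variable name.
import json
import os
import urllib.request

BASE_URL = "https://api.z.ai/api/coding/paas/v4"

def build_request(api_key: str) -> urllib.request.Request:
    """Build the same chat-completions request the curl command sends."""
    payload = {
        "model": "GLM-5",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 10,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("ZAI_API_KEY")
    if key:
        # A valid key should return HTTP 200 with a chat completion body.
        with urllib.request.urlopen(build_request(key), timeout=30) as resp:
            print(resp.status)
            print(resp.read().decode())
    else:
        print("Set ZAI_API_KEY to run the live check.")
```

Same result as curl: the key authenticates fine against Z.AI directly, so the failure is on Cursor's side of the proxy.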
This also matches #149214 (Custom Model problems) and #152503 (Using GLM-4.7 in Cursor) — nearly two months of reports now.
@deanrie — Is there a fix in progress? Even a rough ETA would help. Without any communicated timeline, it’s impossible to tell whether to wait or migrate away.
Issue 2: Why this is critical — BYOK is the only cost-isolation option on Team Plan, and Cursor Token Fee undermines it
Why I need BYOK in the first place
My monthly usage: ~1.05 billion tokens/month. Peak single prompt: 19.26M tokens ($14.81). This is legitimate senior engineering workload, not abuse.
The Team Plan has no per-user on-demand spending limit (docs say per-member limits are Enterprise-only). Only team-wide caps exist, so a high-volume individual like me risks consuming the team’s entire on-demand budget. The alternative is constantly monitoring the dashboard and self-throttling — which defeats the purpose of an AI coding assistant.
BYOK was my solution: route inference costs to my own provider (Z.AI GLM Coding Plan) so my usage doesn’t impact my team.
BYOK doesn’t solve it — Cursor Token Fee consumes the included credit
Despite routing inference costs externally via BYOK, Cursor still charges the Cursor Token Fee: $0.25/MTok on ALL tokens, including BYOK traffic (per the Team Pricing docs, “Cursor Token Fee” section). At my volume that comes to ~$263/month for the fee alone.
The inference cost sits on Z.AI’s side, yet the Token Fee by itself instantly exhausts the Team Plan’s included credit ($20/user/month) and exceeds it roughly 13-fold. The whole point of BYOK is cost isolation; the Token Fee negates it.
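The fee arithmetic is straightforward to check. Token volume, fee rate, and included credit are the figures stated above:

```python
# Cursor Token Fee at the reported volume: $0.25 per million tokens,
# applied to all tokens (input + output + cached), per the Team Pricing docs.
TOKENS_PER_MONTH = 1_050_000_000   # ~1.05B tokens/month (reported usage)
FEE_PER_MTOK = 0.25                # USD per million tokens
INCLUDED_CREDIT = 20.00            # USD included per user/month on Team Plan

fee = TOKENS_PER_MONTH / 1_000_000 * FEE_PER_MTOK
print(f"Monthly Token Fee: ${fee:,.2f}")                             # → Monthly Token Fee: $262.50
print(f"Multiple of included credit: {fee / INCLUDED_CREDIT:.1f}x")  # → Multiple of included credit: 13.1x
```

$262.50/month in fees against a $20/month credit, before a single dollar of inference is billed to Cursor.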
Cursor Token Fee: stated coverage vs. actual costs
The fee covers three items:
- Semantic search (see Cursor docs — “Semantic search” page)
- Custom model execution (Tab, Apply, etc.) (see Cursor blog — “instant-apply” post)
- Infrastructure
However, examining each item’s actual cost reveals a significant gap between those costs and the $0.25/MTok rate.
Semantic search:
- Per the Cursor “semsearch” blog post (Nov 6, 2025) and “secure-codebase-indexing” blog post (Jan 27, 2026), embedding generation happens at indexing time (offline), and unchanged chunks are cached
- At inference time, the cost is a similarity query against Turbopuffer (VectorDB) — not an embedding model inference. VectorDB query costs are very cheap
- This feature has been included in subscription since Codebase Context v1 in June 2023 — over 2 years. It was first charged as a Token Fee item in the September 2025 Team Plan pricing change
- Cursor’s decision to invest in a custom embedding model (semsearch blog: fine-tuned using agent session traces ranked by LLM) was their own business decision. Passing that R&D cost to users as a “processing fee” levied on inference tokens is cost-shifting from a product decision
Tab/Apply:
- Apply (instant-apply blog) uses a Llama-3-70b-based fine-tuned model with speculative decoding at ~1000 tokens/sec. A single file rewrite consumes at most a few thousand tokens — a few cents per operation
- Tab uses an even lighter model, generating a few dozen to a few hundred tokens per completion — effectively zero cost
- I don’t use Tab completion at all — my workflow is Agent prompts only
Infrastructure:
- Proxy routing, file sync, etc. Understood as largely fixed costs
Summary:
| Token Fee item | Actual cost at inference time | Proportional to inference tokens? |
| --- | --- | --- |
| Semantic search | VectorDB query (cheap) | No (scales with codebase size) |
| Tab | Lightweight inference, dozens of tokens | No (scales with completion count) |
| Apply | A few thousand tokens, cents per operation | No (scales with file edit count) |
| Infrastructure | Fixed costs | No |
All three items have low actual costs, and none is proportional to inference token volume. Yet $0.25/MTok is levied on ALL inference tokens (input + output + cached). At my volume (~1.05B tokens/month) that is ~$263/month, while the actual cost of semantic search is DB query fees, Apply comes to a few dollars even at hundreds of executions, and Tab is not used at all.
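To make the gap concrete, here is a rough back-of-envelope comparison. The fee side uses the figures stated above; every per-item "actual cost" number (Apply operation count, tokens per Apply, per-token rate, VectorDB spend) is an assumption I am making for illustration, not a number from Cursor:

```python
# Rough comparison: fee charged vs. estimated actual cost of the covered
# items, at the reported volume. ALL per-item figures below are my own
# illustrative assumptions, not Cursor's real cost data.
TOKENS_PER_MONTH = 1_050_000_000
fee = TOKENS_PER_MONTH / 1_000_000 * 0.25       # $262.50/month, as above

# Assumed actual costs (hypothetical):
apply_ops = 500                                  # file edits per month (assumed)
apply_cost = apply_ops * 3_000 / 1_000_000 * 1.00  # ~3k tokens/op at an assumed $1/MTok
search_cost = 5.00                               # assumed monthly VectorDB query spend
tab_cost = 0.00                                  # Tab not used in this workflow

estimated_actual = apply_cost + search_cost + tab_cost
print(f"Fee charged:      ${fee:,.2f}")          # → Fee charged:      $262.50
print(f"Estimated actual: ${estimated_actual:,.2f}")
print(f"Fee is ~{fee / estimated_actual:.0f}x the estimated actual cost")
```

Even with generous assumptions on the "actual cost" side, the fee is more than an order of magnitude above what the covered items plausibly cost to serve.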
Questions for the Cursor team
1. The scope of the Cursor Token Fee is unclear. In #148596 (BYOK subtracting from Cursor plan usage), an individual Pro plan user reported that BYOK usage was being deducted from their Pro plan credit, with the dashboard showing User API Key / GLM-4.7 / Cost: Included. @Colin responded: “This is the Cursor Token Fee, which applies to Team plans.” However, the reporter explicitly stated “my Cursor Pro plan”; this was an individual Pro plan report, and the documentation mentions the Token Fee only for Team plans. Does that response mean the Cursor Token Fee also applies to individual Pro plan BYOK? If yes, this needs to be documented. If no, then #148596 is a billing bug that needs to be fixed. @deanrie — would appreciate your confirmation on this point as well.
2. Could BYOK users opt out of fee components for features they don’t use?
3. Semantic search has been included in the subscription since June 2023, and the investment in a custom embedding model was Cursor’s own decision. Is it appropriate to recoup that cost via a per-inference-token fee on users? The actual costs of VectorDB queries and Apply operations appear significantly lower than the $0.25/MTok rate.
4. Has the team considered billing the Token Fee only on Cursor-side token consumption (semantic search, Apply) rather than on BYOK inference tokens, or offering a flat monthly rate?
Summary
| Issue | Status | Impact |
| --- | --- | --- |
| BYOK broken | Regression since ~Jan 2026, no fix timeline communicated | Cannot use BYOK at all |
| Cursor Token Fee on BYOK | ~$263/month at my volume | Inference costs routed externally, yet the included credit is instantly consumed by the fee alone; large gap between fee rate and actual costs of covered items |
| No per-user spend limit | Team Plan only; Enterprise required | Root cause forcing reliance on BYOK |
| Token Fee scope unclear | Docs and support responses contradict on Pro plan applicability | Cannot safely rely on BYOK on any plan |
Net result: I cannot use Cursor professionally at the level I need to. A fix timeline for BYOK and a response on the Token Fee structure would be greatly appreciated.
References (thread numbers due to new-user link limit)
BYOK / Custom model issues: #148815 (this thread), #149214, #152503, #147218 (staff-acknowledged), #140266, #132572
Billing: #148596, #140467
Official blogs: “Updates to Teams pricing” (Aug 2025), “Clarifying our pricing” (Jun 2025), “semsearch” (Nov 2025), “secure-codebase-indexing” (Jan 2026), “instant-apply”
Official docs: Team Pricing page (Cursor Token Fee section), Semantic search page