Cannot use BYOK thinking models in inline chat

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

Besides the similar problems mentioned below: Cursor had already disabled thinking models for inline chat, but we could still use some of them, such as Gemini models, with our own API key. Now, even with our own key, they no longer appear in the inline chat model list (probably because they are merged with the Cursor Gemini models).

Anyway, I would really like this brought back, or at least to be allowed to use a thinking model for inline chat. Almost all models are thinking/hybrid now. @deanrie had forwarded this request, but as usual the team isn’t giving any answers.


Steps to Reproduce

Add and activate your Gemini API key, then add a Gemini Flash model.

Expected Behavior

The Gemini Flash model appears in the inline chat model list.

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.2.36 (user setup)
VSCode Version: 1.105.1
Commit: 55c9bc11e99cedd1fb93fbb7996abf779c583150
Date: 2025-12-18T06:25:21.733Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Windows_NT x64 10.0.26200

Does this stop you from using Cursor

No - Cursor works, but with this issue

Hey, thanks for the report.

This is a known limitation - the team intentionally disabled thinking models in inline edit (Cmd+K) due to poor performance. This applies to all thinking models, including Gemini Flash with BYOK.

The team is considering adding model selection for inline edit, but there’s no specific solution or ETA yet.


That’s encouraging news, @deanrie, that you guys are considering it and aligning better with the competition. Please keep us updated.

I’m not a fan of wasting Cursor credits on non-thinking Claude for simple inline questions, and I don’t want to clutter my model list with non-thinking models.

If you don’t want to globally enable Cursor thinking models for inline chat, then at least with BYOK we could use Gemini Flash; it was still working before 2.0.
I never had performance issues, nor did I see complaints about inline chat.

As the name implies, it’s mainly for quick questions and simple things: staying focused in the editor without wasting time opening a whole new chat, deleting it afterwards, etc.
