Local SDK Agent.prompt fails for subset of catalog models
Executive summary
With @cursor/sdk 1.0.12, Node.js 24.x, and a valid CURSOR_API_KEY, Cursor.models.list() returns 27 models. For 9 of those IDs, a minimal local Agent.prompt(...) completes with status: "error" and no assistant result text, while 18 models complete with status: "finished" and a normal text response. The failing set is identical across two different API key types tested (“Cursor Cloud Agents” vs “Cursor API Key”).
Environment
| Item | Value |
|---|---|
| SDK | @cursor/sdk 1.0.12 (project node_modules and global npm install are the same version) |
| Runtime | Node.js 24.x (via tsx; not Bun—user avoided Bun due to HTTP/2 / nghttp2 issues) |
| API | Cursor.me, Cursor.models.list, Agent.prompt |
| Agent mode | Local — local: { cwd: <repo root> } |
| Auth | CURSOR_API_KEY in environment |
Steps to Reproduce
- Clone or open a repo with `@cursor/sdk` ^1.0.12 and `tsx`.
- Export a valid `CURSOR_API_KEY`.
- Run the project probe script (or equivalent): `npm run probe:cursor-models -- --timeout-ms 180000` (`scripts/cursor-sdk-model-probe.ts`: sequential `Agent.prompt` per catalog entry, default/first variant params from `ModelListItem`).
- Prompt text (intentionally strict): ask the model to reply with the single word `ok`.
- Success criterion in harness: `result.status === "finished"` and `result.result` contains substring `ok`.
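For readers without access to the probe script, the steps above can be sketched roughly as follows. This is not the actual `scripts/cursor-sdk-model-probe.ts`: `ProbeRunResult`, `RunPrompt`, `withTimeout`, and `probeSequentially` are hypothetical names, the runner is injected rather than calling the SDK directly, and the exact prompt string is an assumption. Only the pass rule (`status === "finished"` and text containing `ok`) and the 180000 ms timeout come from the report.

```typescript
// Hypothetical shape of a completed run; only `status` and `result` are taken from the report.
interface ProbeRunResult {
  status: string;
  result?: string;
}

// Hypothetical runner: in a real harness this would wrap the SDK's Agent.prompt call.
type RunPrompt = (modelId: string, prompt: string) => Promise<ProbeRunResult>;

// Pass rule from the steps above: finished status and "ok" somewhere in the text.
function passes(run: ProbeRunResult): boolean {
  return run.status === "finished" && (run.result ?? "").includes("ok");
}

// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Probe models one at a time, recording pass/fail per model ID.
async function probeSequentially(
  modelIds: string[],
  runPrompt: RunPrompt,
  timeoutMs = 180_000,
): Promise<Map<string, boolean>> {
  const outcome = new Map<string, boolean>();
  for (const id of modelIds) {
    try {
      // Assumed prompt wording; the report only says it asks for the single word "ok".
      const run = await withTimeout(runPrompt(id, "Reply with the single word ok."), timeoutMs);
      outcome.set(id, passes(run));
    } catch {
      outcome.set(id, false); // thrown errors and timeouts count as failures
    }
  }
  return outcome;
}
```

Injecting the runner keeps the sketch independent of any particular SDK signature, so the same loop can be exercised with a stub in tests.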
Results (aggregate)
| Outcome | Count | Model IDs |
|---|---|---|
| Pass | 18 | default, composer-2, composer-1.5, gpt-5.2, gemini-3.1-pro, gpt-5.4-mini, gpt-5.4-nano, claude-haiku-4-5, gpt-5.3-codex-spark, gpt-5.2-codex, gpt-5.1-codex-max, gpt-5.1, gemini-3-flash, gpt-5.1-codex-mini, claude-sonnet-4, gpt-5-mini, gemini-2.5-flash, kimi-k2.5 |
| Fail (status: "error", empty result) | 9 | gpt-5.3-codex, claude-sonnet-4-6, gpt-5.5, claude-opus-4-7, gpt-5.4, claude-opus-4-6, claude-opus-4-5, grok-4.3, claude-sonnet-4-5 |
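As a quick consistency check on the table above, the two ID lists can be tallied against the 27 models the catalog reportedly returns. The `tally` helper is a hypothetical name introduced for illustration; the IDs are copied from the table.

```typescript
// Model IDs copied from the results table.
const passedIds: string[] = [
  "default", "composer-2", "composer-1.5", "gpt-5.2", "gemini-3.1-pro",
  "gpt-5.4-mini", "gpt-5.4-nano", "claude-haiku-4-5", "gpt-5.3-codex-spark",
  "gpt-5.2-codex", "gpt-5.1-codex-max", "gpt-5.1", "gemini-3-flash",
  "gpt-5.1-codex-mini", "claude-sonnet-4", "gpt-5-mini", "gemini-2.5-flash",
  "kimi-k2.5",
];
const failedIds: string[] = [
  "gpt-5.3-codex", "claude-sonnet-4-6", "gpt-5.5", "claude-opus-4-7", "gpt-5.4",
  "claude-opus-4-6", "claude-opus-4-5", "grok-4.3", "claude-sonnet-4-5",
];

// Count each group, the number of distinct IDs overall, and any ID listed in both groups.
function tally(passed: string[], failed: string[]) {
  const failedSet = new Set(failed);
  const overlap = passed.filter((id) => failedSet.has(id));
  const total = new Set([...passed, ...failed]).size;
  return { pass: passed.length, fail: failed.length, total, overlap };
}

const summary = tally(passedIds, failedIds);
console.log(summary);
```

If the lists are transcribed correctly, the distinct total should match the catalog size and no ID should appear in both groups.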
Expected Behavior
| Expected | Actual |
|---|---|
| Models returned by Cursor.models.list() should be usable for local Agent.prompt with the same API key, or the API should surface a clear upfront constraint (e.g. cloud-only, plan-gated, local-unsupported) so automation can branch. | 9 IDs appear in the catalog but consistently return status: "error" with empty final result for this minimal prompt, without an obvious discriminant in the list response. |
Additional notes for Cursor triage
- Variant selection: The harness uses the default catalog variant when present, else the first variant, mirroring typical “pick a working preset” behavior. If certain variants are required for local runs, documenting that on `ModelListItem` / errors would help.
- Harness UX: When `status === "error"`, the internal probe script previously labeled failures as “unexpected text”; those are failed runs, not non-`ok` wording. Error details (if any) should be taken from SDK/run objects or logs, not that summary string.
- Repro stability: Same 9 failures observed after rotating to a different API key class (`Cursor API Key` vs `Cursor Cloud Agents`), suggesting account-wide or product behavior rather than a single compromised key.
- Artifacts: Full JSON probe output can be attached separately (redact keys); machine log path used locally: `/tmp/symphony-cursor-probe-results.txt` (optional).
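To illustrate the harness labeling point in the notes above, a probe can keep failed runs separate from runs that finished with the wrong text. This is a minimal sketch, not the actual harness code: `ProbeOutcome`, `RunSummary`, and `classify` are hypothetical names, and treating every non-`"finished"` status as a run error is an assumption.

```typescript
// Three outcomes, so an errored run is never reported as "unexpected text".
type ProbeOutcome = "pass" | "run_error" | "unexpected_text";

// Hypothetical minimal view of a run; only `status` and `result` appear in the report.
interface RunSummary {
  status: string;
  result?: string;
}

function classify(run: RunSummary): ProbeOutcome {
  // Assumption: any status other than "finished" is treated as a failed run.
  if (run.status !== "finished") return "run_error";
  // The run finished, so the only remaining question is whether the text contains "ok".
  return (run.result ?? "").includes("ok") ? "pass" : "unexpected_text";
}

console.log(classify({ status: "error", result: "" }));
console.log(classify({ status: "finished", result: "hello" }));
```

Checking the status before inspecting the text is what prevents the mislabeling described above.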
Suggested questions for Cursor
- Are the 9 failing IDs expected to be unsupported for local SDK agents with certain keys or plans?
- Should `Cursor.models.list()` annotate local vs cloud-only / gated models?
- What is the recommended way to retrieve a structured error when `RunResult.status === "error"` (for CI and runners)?
Operating System
macOS
Version Information
Version: 3.2.21
VSCode Version: 1.105.1
Commit: 806df57ed3b6f1ee0175140d38039a38574ec720
Date: 2026-05-03T01:46:14.413Z
Layout: glass
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Darwin arm64 25.3.0
For AI issues: which model did you use?
composer-2-fast
For AI issues: add Request ID with privacy disabled
ffe97943-d6a2-4939-ac65-a9efb6e474fe
Does this stop you from using Cursor
No - Cursor works, but with this issue