@cursor/sdk 1.0.12 local runs: Opus / GPT-5.4–5.5 / Sonnet 4.5–4.6 / Grok / Codex 5.3 error with empty RunResult

Local SDK Agent.prompt fails for subset of catalog models

Executive summary

With @cursor/sdk 1.0.12, Node.js 24.x, and a valid CURSOR_API_KEY, Cursor.models.list() returns 27 models. For 9 of those IDs, a minimal local Agent.prompt(...) completes with status: "error" and no assistant result text, while 18 models complete with status: "finished" and a normal text response. The failing set is identical across two different API key types tested (“Cursor Cloud Agents” vs “Cursor API Key”).

Environment

Item | Value
SDK | @cursor/sdk 1.0.12 (project node_modules and global npm install match)
Runtime | Node.js 24.x (via tsx; not Bun, which the user avoided because of HTTP/2 / nghttp2 issues)
API | Cursor.me, Cursor.models.list, Agent.prompt
Agent mode | Local, local: { cwd: <repo root> }
Auth | CURSOR_API_KEY in environment

Steps to Reproduce

  1. Clone or open a repo with @cursor/sdk ^1.0.12 and tsx.

  2. Export a valid CURSOR_API_KEY.

  3. Run the project probe script (or equivalent):

    npm run probe:cursor-models -- --timeout-ms 180000
    

    (scripts/cursor-sdk-model-probe.ts runs Agent.prompt sequentially for each catalog entry, taking params from the default variant on ModelListItem, or from the first variant when there is no default.)

  4. Prompt text (intentionally strict): ask the model to reply with the single word ok.

  5. Success criterion in harness: result.status === "finished" and result.result contains substring ok.
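The success criterion in step 5 can be sketched as a small classifier. This is a minimal illustration, not the actual probe script: `ProbeRunResult`, `ProbeOutcome`, and `classifyRun` are hypothetical names, and the interface models only the two fields named in this report (`status` and `result`), so the real SDK type may carry more.

```typescript
// Assumed shape: only the two fields mentioned in this report are modeled.
interface ProbeRunResult {
  status: string;   // "finished" or "error" were the values observed
  result?: string;  // final assistant text; empty or missing on failed runs
}

type ProbeOutcome = "pass" | "unexpected-text" | "run-failed";

// A run that did not finish is reported as a failed run, kept separate from
// a finished run whose text simply does not contain "ok".
function classifyRun(run: ProbeRunResult): ProbeOutcome {
  if (run.status !== "finished") {
    return "run-failed";
  }
  return (run.result ?? "").includes("ok") ? "pass" : "unexpected-text";
}
```

Keeping "run-failed" distinct from "unexpected-text" means a summary line never describes an errored run as a wording mismatch.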

Results (aggregate)

Outcome | Count | Model IDs
Pass | 18 | default, composer-2, composer-1.5, gpt-5.2, gemini-3.1-pro, gpt-5.4-mini, gpt-5.4-nano, claude-haiku-4-5, gpt-5.3-codex-spark, gpt-5.2-codex, gpt-5.1-codex-max, gpt-5.1, gemini-3-flash, gpt-5.1-codex-mini, claude-sonnet-4, gpt-5-mini, gemini-2.5-flash, kimi-k2.5
Fail (status: "error", empty result) | 9 | gpt-5.3-codex, claude-sonnet-4-6, gpt-5.5, claude-opus-4-7, gpt-5.4, claude-opus-4-6, claude-opus-4-5, grok-4.3, claude-sonnet-4-5

Expected Behavior

Expected: Models returned by Cursor.models.list() should be usable for local Agent.prompt with the same API key, or the API should surface a clear upfront constraint (e.g. cloud-only, plan-gated, local-unsupported) so automation can branch.

Actual: 9 IDs appear in the catalog but consistently return status: "error" with empty final result for this minimal prompt, without an obvious discriminant in the list response.

Additional notes for Cursor triage

  1. Variant selection: The harness uses the default catalog variant when present, otherwise the first variant, mirroring typical “pick a working preset” behavior. If certain variants are required for local runs, documenting that on ModelListItem / errors would help.

  2. Harness UX: When status === "error", the internal probe script previously labeled failures as “unexpected text”; those are failed runs, not responses worded differently from ok. Error details (if any) should be taken from SDK/run objects or logs, not that summary string.

  3. Repro stability: Same 9 failures observed after rotating to a different API key class (Cursor API Key vs Cursor Cloud Agents), suggesting account-wide or product behavior rather than a single compromised key.

  4. Artifacts: Full JSON probe output can be attached separately (redact keys); machine log path used locally: /tmp/symphony-cursor-probe-results.txt (optional).
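The variant rule from note 1 (default variant when present, otherwise the first) could look roughly like the sketch below. The `VariantSketch` shape and its `isDefault` flag are assumptions made for illustration; the actual fields on ModelListItem are not shown in this report and may differ.

```typescript
// Hypothetical variant shape; the real ModelListItem fields may differ.
interface VariantSketch {
  id: string;
  isDefault?: boolean;
}

// Prefer the variant flagged as default; fall back to the first entry.
// Returns undefined when the model lists no variants at all.
function pickVariant(variants: VariantSketch[]): VariantSketch | undefined {
  return variants.find((v) => v.isDefault === true) ?? variants[0];
}
```

Returning undefined for an empty list leaves it to the caller to decide whether to skip the model or prompt it without variant params.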

Suggested questions for Cursor

  • Are the 9 failing IDs expected to be unsupported for local SDK agents with certain keys or plans?
  • Should Cursor.models.list() annotate local vs cloud-only / gated models?
  • What is the recommended way to retrieve a structured error when RunResult.status === "error" (for CI and runners)?

Operating System

macOS

Version Information

Version: 3.2.21
VSCode Version: 1.105.1
Commit: 806df57ed3b6f1ee0175140d38039a38574ec720
Date: 2026-05-03T01:46:14.413Z
Layout: glass
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Darwin arm64 25.3.0

For AI issues: which model did you use?

composer-2-fast

For AI issues: add Request ID with privacy disabled

ffe97943-d6a2-4939-ac65-a9efb6e474fe

Does this stop you from using Cursor

No - Cursor works, but with this issue

Thank you for the thorough report and testing matrix.

This seems like a bug. The 9 failing models require a configuration (“Max Mode”) that the SDK’s local runtime doesn’t currently support for your plan type. The SDK accepts these model IDs from Cursor.models.list() but can’t enable the required setting, so the run fails silently with status: "error".

Our team is aware and actively investigating a fix. For now, the 18 passing models (including composer-2, claude-sonnet-4, gemini-3.1-pro, and the full set from your results) are the ones available for local SDK runs on your plan.
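Until a fix ships, a runner could act on this by filtering catalog IDs against the set that passed in the report. The sketch below is one possible workaround, not an SDK feature: `LOCAL_PASSING_IDS` and `splitByLocalSupport` are hypothetical names, and the ID list is copied from the reported results, so it may differ for other plans or change over time.

```typescript
// IDs that finished successfully in the reported probe run (18 entries).
const LOCAL_PASSING_IDS: ReadonlySet<string> = new Set([
  "default", "composer-2", "composer-1.5", "gpt-5.2", "gemini-3.1-pro",
  "gpt-5.4-mini", "gpt-5.4-nano", "claude-haiku-4-5", "gpt-5.3-codex-spark",
  "gpt-5.2-codex", "gpt-5.1-codex-max", "gpt-5.1", "gemini-3-flash",
  "gpt-5.1-codex-mini", "claude-sonnet-4", "gpt-5-mini", "gemini-2.5-flash",
  "kimi-k2.5",
]);

// Split catalog IDs into those known to work locally and those to skip.
function splitByLocalSupport(catalogIds: string[]): {
  usable: string[];
  skipped: string[];
} {
  const usable: string[] = [];
  const skipped: string[] = [];
  for (const id of catalogIds) {
    (LOCAL_PASSING_IDS.has(id) ? usable : skipped).push(id);
  }
  return { usable, skipped };
}
```

For example, passing `["composer-2", "gpt-5.5"]` puts `composer-2` in `usable` and `gpt-5.5` in `skipped`, so automation can branch before starting a run that is expected to fail.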

Feel free to check back or reach out if you have other questions.

Thank you for the prompt response. I look forward to future iterations of the SDK.