Cursor Python SDK does not emit `shell-output-delta` for local shell stdout

Where does the bug appear (feature/product)?

Cursor SDK

Describe the Bug

The Cursor Python SDK documentation lists ShellOutputDeltaUpdate as one of the
raw InteractionUpdate variants delivered through SendOptions(on_delta=...).
For a local SDK agent running the built-in shell tool, the Python SDK emits:

  • tool-call-started
  • tool-call-completed

but does not emit any shell-output-delta updates while the shell process is
running.

The completed tool call carries the full stdout in
tool_call.result.value.stdout, so the shell command does run to completion.
However, callers cannot render live shell output in a streaming UI, because
stdout only becomes visible after the tool exits.
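
To make the impact concrete, here is a minimal sketch of the kind of handler a streaming UI might hang off on_delta. It does not call cursor_sdk: it works on flattened dicts like the ones the repro script records ("type", "stdout"), and the "output" key for a shell-output-delta chunk is an assumption, not a documented field name. make_shell_renderer is a hypothetical helper.

```python
from typing import Any, Callable


def make_shell_renderer(write: Callable[[str], None]) -> Callable[[dict[str, Any]], None]:
    """Return a handler that writes shell output as soon as it is available.

    Assumed update shape (not the SDK's real objects): a dict with a "type"
    key, an "output" chunk on shell-output-delta (hypothetical field name),
    and the full "stdout" on tool-call-completed.
    """
    streamed = False  # did any live chunk arrive for the current tool call?

    def handle(update: dict[str, Any]) -> None:
        nonlocal streamed
        kind = update.get("type")
        if kind == "tool-call-started":
            streamed = False
        elif kind == "shell-output-delta":
            chunk = update.get("output", "")
            if chunk:
                streamed = True
                write(chunk)
        elif kind == "tool-call-completed" and not streamed:
            # Fallback: nothing was streamed, so show the final stdout at once.
            write(update.get("stdout", ""))

    return handle
```

With the behaviour described above, only the fallback branch ever fires, so the UI shows all of the output at once when the command exits.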

Steps to Reproduce

The repro script below is independent of the application in which the issue was
found. It imports only cursor_sdk and standard library modules.

It creates:

  • a temporary workspace;
  • a temporary HOME;
  • a tiny git repository in the temporary workspace.

From any environment with cursor-sdk installed:

export CURSOR_API_KEY=...
python repro_shell_output_delta_missing.py

The script exits with:

  • 0 if at least one shell-output-delta is observed;
  • 1 if the shell tool completes and returns stdout, but no
    shell-output-delta is observed;
  • 2 if CURSOR_API_KEY is missing;
  • 3 if the repro is indeterminate, for example no completed shell tool call is
    observed before the timeout.
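
If the repro is run from another script (for example in CI), the exit codes above can be turned into readable messages. This is only a sketch: describe_exit and run_repro are hypothetical helpers, and the script path is assumed to be the file name used in this post.

```python
import subprocess
import sys

# Meanings copied from the exit-code list above.
EXIT_MEANINGS = {
    0: "at least one shell-output-delta was observed",
    1: "shell tool completed with stdout, but no shell-output-delta was observed",
    2: "CURSOR_API_KEY is missing",
    3: "indeterminate (for example, no completed shell tool call before the timeout)",
}


def describe_exit(code: int) -> str:
    """Translate a repro exit code into a human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_repro(script: str = "repro_shell_output_delta_missing.py") -> tuple[int, str]:
    """Run the repro with the current interpreter and describe its exit code."""
    proc = subprocess.run([sys.executable, script], check=False)
    return proc.returncode, describe_exit(proc.returncode)
```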

repro_shell_output_delta_missing.py:

#!/usr/bin/env python3
"""Reproduce missing shell-output-delta updates in the Cursor Python SDK.

The script creates a temporary workspace and a temporary HOME, sends a prompt
that asks the local SDK agent to run a shell command that prints one line per
second, and records raw on_delta update types.

It imports only cursor_sdk and standard library modules. It does not read real
project settings, real user settings, or print CURSOR_API_KEY.
"""

from __future__ import annotations

import argparse
import dataclasses
import importlib.metadata
import json
import os
import subprocess
import sys
import tempfile
import threading
import time
from pathlib import Path
from typing import Any

from cursor_sdk import Agent, LocalAgentOptions, SendOptions


def main() -> int:
    parser = argparse.ArgumentParser(
        description="Reproduce missing shell-output-delta updates for local Cursor Python SDK shell tools."
    )
    parser.add_argument("--model", default="default", help="Cursor model id.")
    parser.add_argument("--ack-count", type=int, default=5, help="Number of ack lines the shell command should print.")
    parser.add_argument("--ack-sleep", type=float, default=1.0, help="Seconds to sleep between ack lines.")
    parser.add_argument("--timeout", type=float, default=45.0, help="Seconds to wait before cancelling the run.")
    parser.add_argument("--keep-temp", action="store_true", help="Keep the temporary fixture for inspection.")
    args = parser.parse_args()

    if not os.environ.get("CURSOR_API_KEY"):
        print("CURSOR_API_KEY is required", file=sys.stderr)
        return 2

    with tempfile.TemporaryDirectory(prefix="cursor-sdk-shell-output-delta-") as tmp:
        root = Path(tmp)
        workspace = root / "workspace"
        fake_home = root / "home"
        workspace.mkdir(parents=True)
        fake_home.mkdir(parents=True)
        git_init(workspace)

        old_home = os.environ.get("HOME")
        os.environ["HOME"] = str(fake_home)
        try:
            report = run_case(
                workspace=workspace,
                model=args.model,
                ack_count=args.ack_count,
                ack_sleep=args.ack_sleep,
                timeout=args.timeout,
            )
        finally:
            if old_home is None:
                os.environ.pop("HOME", None)
            else:
                os.environ["HOME"] = old_home

        report["fixture"] = {
            "workspace": "$TEMP_WORKSPACE",
            "home": "$TEMP_HOME",
            "ack_count": args.ack_count,
            "ack_sleep_seconds": args.ack_sleep,
        }
        print(json.dumps(report, ensure_ascii=False, indent=2))

        if args.keep_temp:
            print(f"Temporary fixture kept at: {root}", file=sys.stderr)
            input("Press Enter after inspecting the fixture to remove it...")

    if report["saw_shell_output_delta"]:
        return 0
    if report["saw_shell_tool_completed"] and report["completed_shell_stdout"]:
        return 1
    return 3


def run_case(*, workspace: Path, model: str, ack_count: int, ack_sleep: float, timeout: float) -> dict[str, Any]:
    started_at = time.monotonic()
    deltas: list[dict[str, Any]] = []
    messages: list[dict[str, Any]] = []
    errors: list[str] = []
    done = threading.Event()
    lock = threading.Lock()

    def elapsed() -> float:
        return round(time.monotonic() - started_at, 3)

    def on_delta(delta: Any) -> None:
        with lock:
            deltas.append({"t": elapsed(), **summarize_delta(delta)})

    prompt = build_prompt(ack_count=ack_count, ack_sleep=ack_sleep)

    with Agent.create(
        model=model,
        api_key=os.environ["CURSOR_API_KEY"],
        local=LocalAgentOptions(cwd=str(workspace), setting_sources=[]),
    ) as agent:
        run = agent.send(prompt, options=SendOptions(on_delta=on_delta))

        def consume_messages() -> None:
            try:
                for index, message in enumerate(run.messages()):
                    with lock:
                        messages.append(
                            {
                                "t": elapsed(),
                                "index": index,
                                "type": str(getattr(message, "type", type(message).__name__)),
                                "status": str(getattr(message, "status", "") or ""),
                            }
                        )
            except BaseException as exc:
                errors.append(f"{type(exc).__name__}: {exc}")
            finally:
                done.set()

        thread = threading.Thread(target=consume_messages, daemon=True)
        thread.start()

        deadline = time.monotonic() + timeout
        while not done.wait(0.1):
            with lock:
                saw_completed = any(d.get("type") == "tool-call-completed" and d.get("tool_type") == "shell" for d in deltas)
                saw_assistant_text = any(d.get("type") == "text-delta" for d in deltas)
            if saw_completed and saw_assistant_text:
                break
            if time.monotonic() >= deadline:
                break

        if not done.is_set():
            try:
                run.cancel()
            except BaseException as exc:
                errors.append(f"cancel_error: {type(exc).__name__}: {exc}")
            done.wait(2.0)

    with lock:
        observed_deltas = list(deltas)
        observed_messages = list(messages)

    shell_completed = [
        d for d in observed_deltas if d.get("type") == "tool-call-completed" and d.get("tool_type") == "shell"
    ]
    shell_started = [d for d in observed_deltas if d.get("type") == "tool-call-started" and d.get("tool_type") == "shell"]
    shell_output_deltas = [d for d in observed_deltas if d.get("type") == "shell-output-delta"]

    return {
        "cursor_sdk_version": package_version("cursor-sdk"),
        "model": model,
        "saw_shell_output_delta": bool(shell_output_deltas),
        "saw_shell_tool_start": bool(shell_started),
        "saw_shell_tool_completed": bool(shell_completed),
        "completed_shell_stdout": first_nonempty(d.get("stdout") for d in shell_completed),
        "shell_output_deltas": shell_output_deltas,
        "delta_types": [str(d.get("type", "")) for d in observed_deltas],
        "deltas": observed_deltas,
        "messages": observed_messages,
        "errors": errors,
    }


def build_prompt(*, ack_count: int, ack_sleep: float) -> str:
    command = (
        "python -c \"import time; "
        f"[print('ack %d' % i, flush=True) or time.sleep({ack_sleep!r}) for i in range(1, {ack_count + 1})]\""
    )
    return (
        "Use the shell tool to run exactly this command, then reply exactly DONE.\n\n"
        f"{command}\n\n"
        "Do not edit files. Do not replace the command with another command."
    )


def summarize_delta(delta: Any) -> dict[str, Any]:
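    data = object_to_jsonable(delta)
    if not isinstance(data, dict):
        # Unknown update shape (object_to_jsonable fell back to a non-dict):
        # record only the class name instead of failing on data.get().
        return {"type": type(delta).__name__}
    delta_type = str(data.get("type", type(delta).__name__))
    summary: dict[str, Any] = {"type": delta_type}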

    tool_call = data.get("tool_call") if isinstance(data.get("tool_call"), dict) else {}
    if tool_call:
        summary["tool_type"] = tool_call.get("type")
        summary["tool_name"] = tool_call.get("name")
        result = tool_call.get("result") if isinstance(tool_call.get("result"), dict) else {}
        value = result.get("value") if isinstance(result.get("value"), dict) else {}
        stdout = value.get("stdout")
        stderr = value.get("stderr")
        if isinstance(stdout, str):
            summary["stdout"] = stdout
            summary["stdout_len"] = len(stdout)
        if isinstance(stderr, str):
            summary["stderr"] = stderr
            summary["stderr_len"] = len(stderr)

    for key in ("text", "stdout", "stderr", "output"):
        value = data.get(key)
        if isinstance(value, str) and value:
            summary[key] = value
            summary[f"{key}_len"] = len(value)

    return summary


def object_to_jsonable(value: Any) -> Any:
    if dataclasses.is_dataclass(value):
        return object_to_jsonable(dataclasses.asdict(value))
    if isinstance(value, dict):
        return {str(k): object_to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [object_to_jsonable(v) for v in value]
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    if hasattr(value, "__dict__"):
        return object_to_jsonable(vars(value))
    return repr(value)


def first_nonempty(values: Any) -> str:
    for value in values:
        if isinstance(value, str) and value:
            return value
    return ""


def package_version(name: str) -> str:
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return "unknown"


def git_init(workspace: Path) -> None:
    subprocess.run(["git", "init", "-q"], cwd=str(workspace), check=True)
    (workspace / "README.md").write_text("# Cursor SDK shell-output-delta repro\n", encoding="utf-8")
    subprocess.run(["git", "add", "README.md"], cwd=str(workspace), check=True)
    subprocess.run(
        # Placeholder identity for the throwaway commit; the address is arbitrary.
        ["git", "-c", "user.name=Repro", "-c", "user.email=repro@example.com", "commit", "-qm", "init"],
        cwd=str(workspace),
        check=True,
    )


if __name__ == "__main__":
    raise SystemExit(main())

Expected Behavior

Expected

For a local shell command that prints one line per second, on_delta should emit
one or more shell-output-delta updates before tool-call-completed, or the SDK
documentation should clarify that shell stdout is not streamed for local Python
SDK agents.

Observed

With cursor-sdk==0.1.5, a command like:

python -c "import time; [print('ack %d' % i, flush=True) or time.sleep(1.0) for i in range(1, 6)]"

produces a tool-call-completed update with:

{
  "stdout": "ack 1\nack 2\nack 3\nack 4\nack 5\n"
}

but no shell-output-delta update is observed during execution.
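
As a control, the same kind of command can be run directly with subprocess to check that it really does emit one line at a time when nothing sits in between. The sketch below uses only the standard library and does not touch cursor_sdk; stream_ack_lines is a hypothetical helper, and it uses sys.executable instead of a bare python.

```python
import subprocess
import sys
import time


def stream_ack_lines(count: int = 5, sleep: float = 1.0) -> list[tuple[float, str]]:
    """Run the ack command and return (seconds_since_start, line) pairs."""
    code = (
        "import time\n"
        f"for i in range(1, {count + 1}):\n"
        "    print('ack %d' % i, flush=True)\n"
        f"    time.sleep({sleep!r})\n"
    )
    started = time.monotonic()
    seen: list[tuple[float, str]] = []
    with subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    ) as proc:
        assert proc.stdout is not None
        # Each line is read as soon as the child flushes it.
        for line in proc.stdout:
            seen.append((round(time.monotonic() - started, 3), line.rstrip("\n")))
    return seen


if __name__ == "__main__":
    for t, line in stream_ack_lines():
        print(f"{t:7.3f}s  {line}")
```

If the timestamps are spread roughly one sleep interval apart, the command itself is flushing per line, which suggests the buffering seen through on_delta happens somewhere between the shell tool and the callback rather than in the child process.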

Operating System

Linux

Version Information

cursor-sdk==0.1.5

Additional Information

Representative Output

A failing run looks like:

{
  "cursor_sdk_version": "0.1.5",
  "saw_shell_output_delta": false,
  "saw_shell_tool_start": true,
  "saw_shell_tool_completed": true,
  "completed_shell_stdout": "ack 1\nack 2\nack 3\nack 4\nack 5\n",
  "delta_types": [
    "thinking-delta",
    "token-delta",
    "tool-call-started",
    "tool-call-completed",
    "text-delta"
  ]
}

The exact thinking/text chunks vary by model. The important signal is that the
completed shell tool result contains the expected stdout, while
saw_shell_output_delta remains false.
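
The same signal can be checked mechanically on the delta_types list from a report. A small sketch (shell_output_streamed is a hypothetical helper; it does not distinguish shell tool calls from other tool calls, which is enough for this single-tool repro):

```python
def shell_output_streamed(delta_types: list[str]) -> bool:
    """True if a shell-output-delta arrives while a tool call is in flight,
    i.e. after tool-call-started and before the next tool-call-completed."""
    in_tool_call = False
    for delta_type in delta_types:
        if delta_type == "tool-call-started":
            in_tool_call = True
        elif delta_type == "tool-call-completed":
            in_tool_call = False
        elif delta_type == "shell-output-delta" and in_tool_call:
            return True
    return False


# The sequence from the failing run above.
failing = [
    "thinking-delta",
    "token-delta",
    "tool-call-started",
    "tool-call-completed",
    "text-delta",
]
print(shell_output_streamed(failing))  # False
```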

Questions

Is ShellOutputDeltaUpdate(type="shell-output-delta") expected to be emitted for
local Python SDK shell tool stdout?

If yes, this appears to be a local SDK streaming bug: stdout is returned only in
the completed tool result, not as live shell-output deltas.

If no, what is the recommended pattern for a streaming UI to render long-running
shell output before the command exits?

Does this stop you from using Cursor

Sometimes - I can sometimes use Cursor

Hi Shurui! First of all, I’ve been seeing the effort you put into identifying and reporting these SDK bugs. You have close to a 100% hit rate with clear diagnostics and repro scripts, and I really appreciate the clarity.

This is a confirmed gap. The ShellOutputDeltaUpdate type exists in both the TypeScript and Python SDK type systems, and the conversion logic is in place, but for model-initiated shell tool calls (the standard path when the LLM runs a shell command), incremental output events are currently filtered out before reaching on_delta. Only the final stdout appears in the tool-call-completed update.

Right now there’s no workaround for streaming shell output through the on_delta callback before the command exits. I’ve filed this with our SDK team and we’ll be tracking this thread.