Using the SDK, how can I check the context window usage?

azhang · May 24, 2026, 9:40am

Hi everyone,

I’m trying to monitor my 200K context window usage programmatically.

I know the turn-ended event provides a usage object:

{
  "type": "turn-ended", 
  "usage": {
    "inputTokens": 1235,
    "outputTokens": 10,
    "cacheReadTokens": 1230,
    "cacheWriteTokens": 0
  }
}

However, this is an aggregate total for the entire turn (prompt + tool calls + responses).

What I actually need is the exact input token size of an individual request sent to the LLM at any specific step. Is there a way to expose or calculate this payload size so I can accurately track my context limit?

Thanks!

deanrie · May 26, 2026, 1:43pm

Hey, as of right now the SDK gives three token-related events, and none of them cover what you want:

turn-ended.usage is the aggregate for the whole turn, which you already know
token-delta is a heuristic running count during streaming (chars/4), fine for a progress bar but not good for tracking the limit
step-started / step-completed include stepId and duration, but no tokens

There’s currently no exact input token size for a single LLM request inside a multi-step turn in the SDK. This is a valid feature request, I’ll log it internally for the Async Agents team so they can add per-step usage to step-completed (or as a separate event). I can’t give an ETA.

As a temporary workaround to track getting close to 200K, you can treat inputTokens + cacheReadTokens + cacheWriteTokens from turn-ended as a lower bound for the context size at the end of the turn (the last step is usually closest to the peak). Not perfect, but it gives a rough boundary between turns.
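
A minimal sketch of that sum, assuming the usage object has the shape shown in the first post. The helper names and the CONTEXT_LIMIT constant are illustrative only and are not part of the SDK. The result only approximates the context size when the turn made a single LLM request.

```typescript
// Shape of the usage object from the turn-ended event quoted in the first post.
interface TurnUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

// Hypothetical constant: the 200K window discussed in this thread.
const CONTEXT_LIMIT = 200_000;

// Adds up the three input-side counters reported for the whole turn.
function turnInputTotal(usage: TurnUsage): number {
  return usage.inputTokens + usage.cacheReadTokens + usage.cacheWriteTokens;
}

// Expresses the turn total as a fraction of the assumed limit.
function fractionOfLimit(usage: TurnUsage, limit: number = CONTEXT_LIMIT): number {
  return turnInputTotal(usage) / limit;
}

// Values copied from the example event in the first post.
const example: TurnUsage = {
  inputTokens: 1235,
  outputTokens: 10,
  cacheReadTokens: 1230,
  cacheWriteTokens: 0,
};

console.log(turnInputTotal(example)); // 2465
console.log(fractionOfLimit(example) < 0.02); // true
```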

azhang · May 26, 2026, 9:25pm

I tried that. If the turn involves many tool calls, inputTokens + cacheReadTokens + cacheWriteTokens can easily be much larger than 200K, because each turn contains many LLM requests.

deanrie · May 27, 2026, 6:07am

Fair point, my workaround doesn’t work here. In a multi-step turn, each LLM call re-sends almost the whole context, and the turn-ended total counts that as new tokens. So it can easily go over 200K, even if the actual peak context was lower.
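
To make the double counting concrete, here is a small worked example. The per-step numbers are invented for illustration, since the SDK does not report them; the point is only that the sum across steps can pass 200K while no single request comes close.

```typescript
// Hypothetical input sizes (in tokens) of four LLM requests inside one turn.
// Each request re-sends the earlier context plus the new tool results.
const stepInputTokens: number[] = [60_000, 65_000, 72_000, 80_000];

// What an aggregate counter reports: every step's input summed together.
const aggregate = stepInputTokens.reduce((sum, n) => sum + n, 0);

// What actually matters for the context limit: the largest single request.
const peak = Math.max(...stepInputTokens);

console.log(aggregate); // 277000
console.log(peak); // 80000
console.log(aggregate > 200_000 && peak < 200_000); // true
```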

Honest answer: with the current SDK, you can’t reliably calculate the context window between steps. We’d need per-step usage in step-completed (or a separate event), and that doesn’t exist yet.

I’ll log this as a feature request for the team. I can’t share an ETA. If there’s an update, we’ll reply in the thread.

deanrie · June 17, 2026, 10:48am

Hey @azhang, coming back with an update on what we talked about earlier.

We looked into adding per-step token usage to the SDK, but for now we decided not to do it, so there won’t be separate per-request context tracking anytime soon. The supported signal stays the same: aggregate turn-ended.usage. Like you pointed out, for a multi-step turn it overestimates the real context size since each step re-sends almost the whole context and those tokens get summed again. Because of that, you can’t reliably calculate the peak context window between steps from it.

To be honest, with the current SDK there isn’t accurate per-step context tracking.

If this is a blocker for what you’re building, tell us more about your use case and we can revisit it.

azhang · June 18, 2026, 6:21pm

Because my system continuously reuses the same agent across multiple rounds of a “modify-and-review” loop to save tokens—rather than spinning up a brand-new agent for every single “Implement agent” and “Review agent” task—I strictly need to monitor the remaining capacity of the current agent’s context window.

If the remaining context budget runs low, I would typically choose to either:

Initialize a new Agent instead of reusing the existing one, or
Apply a custom compaction method.

Without the ability to monitor the remaining context window under our current reuse approach, the system might trigger Cursor’s built-in compaction automatically. The results of that automated compaction may not align with my requirements; in that scenario I would much prefer the first option (creating a new Agent entirely).
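
Roughly, the decision I want to make between rounds looks like the sketch below. All names are hypothetical, and estimatedContextTokens is exactly the value I cannot get from the SDK today, so it would have to come from my own estimate.

```typescript
// The three outcomes described above.
type Action = "reuse" | "new-agent" | "custom-compaction";

// Hypothetical policy settings; none of these come from the SDK.
interface BudgetPolicy {
  contextLimit: number; // total window, e.g. 200_000
  reserveTokens: number; // headroom required before starting another round
  preferNewAgent: boolean; // true: start a fresh agent; false: compact manually
}

// Decides what to do before the next modify-and-review round.
function chooseAction(estimatedContextTokens: number, policy: BudgetPolicy): Action {
  const remaining = policy.contextLimit - estimatedContextTokens;
  if (remaining >= policy.reserveTokens) {
    return "reuse";
  }
  return policy.preferNewAgent ? "new-agent" : "custom-compaction";
}

const policy: BudgetPolicy = {
  contextLimit: 200_000,
  reserveTokens: 40_000,
  preferNewAgent: true,
};

console.log(chooseAction(50_000, policy)); // "reuse"
console.log(chooseAction(190_000, policy)); // "new-agent"
```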

deanrie · June 20, 2026, 11:22am

Hey, thanks for writing out the use case. It makes it clearer what’s blocking you.

In short: with the current SDK, you still can’t reliably calculate the remaining context window between steps. turn-ended.usage is still the only supported signal, and for a multi-step turn it overestimates the real context since each step re-sends almost the whole context and those tokens get counted again. Right now, step-completed doesn’t include a per-step input token count.

Your setup of reusing one agent in a modify-and-review loop, and wanting to decide yourself when to create a new agent or apply custom compaction instead of the built-in one, is a clear and concrete reason. I’ll pass it to the team as justification for the request for per-step usage and better SDK observability. I can’t promise it’ll get implemented, and I can’t share an ETA, but details like this really help make the case for the feature.

If there’s an update, I’ll reply in the thread.

deanrie · July 6, 2026, 1:20pm

Hey @azhang, quick update on this.

The latest version of the SDK (@cursor/sdk) now returns token usage more directly: during streaming you get per-turn usage events via run.stream(), and on a finished run you get the cumulative total via run.wait() or run.usage.

That said, the exact per-step input token count in the middle of a turn still isn’t exposed. So, like you pointed out in #6, the aggregate still overestimates the real context in a multi-step turn, where the context is re-sent on each step and tokens get re-counted. step-completed currently only includes stepId and stepDurationMs, with no token info.

For your reuse-and-compact loop, is cumulative and per-turn usage enough signal to decide when to spin up a new agent, or do you strictly need the exact per-step context size? This directly helps us prioritize deeper context-usage reporting. I can’t give an ETA for per-step yet.
