Abnormal usage spike (~52.6M tokens in one entry) while agent output did not match project editorial rules

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

Summary

On 2026-04-16 ~16:43 (local time as shown in my usage table), a single usage row for composer-2-fast recorded ~52.6M tokens, which is orders of magnitude higher than expected for the work being done (revising a set of Markdown chapter files). Other entries on the same day were much smaller (e.g. hundreds of thousands to a few million tokens). This looks abnormal and may indicate miscounting, runaway context, or duplicate aggregation, but I cannot verify without backend detail.

What I was doing
I used Cursor Agent in Cursor IDE to revise book-style Markdown files (*_revised.md) under a writing pipeline rule targeting about 180 lines per chapter (allowed range 140–220 per the project skill), with Traditional Chinese (Taiwan) as the house style.

What went wrong (product quality)

The agent initially treated the minimum (140 lines) as the target and stopped there, producing uniformly ~140-line drafts instead of the project’s intended ~180-line “healthy” length.
The output contained mixed Simplified/Traditional Chinese and other quality issues, requiring additional correction passes (OpenCC normalization + manual edits + expansions).

Why I’m reporting this as a bug
Even if the model output was imperfect, ~52.6M tokens for one timestamped entry is not plausible for “edit ~15 Markdown chapters” unless something in the tooling is pulling in huge context, looping, or metering incorrectly. I’m requesting investigation of whether this is a billing/telemetry bug or an agent/context bug.

Expected behavior

Token usage should scale with actual user-visible edits and reasonable conversation context, not jump to tens of millions for lightweight text work.
If this is working as intended (WAI), the product should surface a per-request breakdown or warnings when context/tool output becomes extremely large.

Actual behavior

One usage record shows ~52.6M tokens for composer-2-fast at 2026-04-16 ~16:43 (billing shows Included, but the magnitude is still alarming).

Evidence

Screenshot of usage table showing the spike (~52.6M tokens) at Apr 16, 04:43 PM for composer-2-fast.

Environment

Product: Cursor IDE
Model shown in usage: composer-2-fast
Workspace: Markdown editorial files on local/iCloud-backed paths (no expectation of scanning huge binary deps for this task)

Request
Please help confirm whether the ~52.6M entry is accurate and, if not, correct metering; if accurate, please advise what underlying mechanism could explain it (e.g., large hidden context, tool loops, indexing). If needed, I can provide additional screenshots and exact timestamps with timezone.

Steps to Reproduce

Open Cursor IDE with a workspace containing multiple Markdown files (e.g. ch4.md–ch18.md and corresponding *_revised.md targets) under a project rule that specifies ~140–220 lines for revised chapters (intended “healthy” length around ~180 lines, not pegging at the minimum 140).
Use Agent with model composer-2-fast and instruct it to revise those chapters per the rule: produce revised.md, integrate “reader perspective” blocks, end-of-chapter questions, author note, Traditional Chinese (Taiwan) house style, and avoid forbidden patterns (e.g. bare article filenames, internal persona tags).
Let the agent run to completion across the batch (multiple files), including any follow-up fixes when output quality fails (mixed Simplified/Traditional Chinese, line-count too close to floor, cross-chapter consistency).
After the session, open Usage / Billing history and locate the entry for composer-2-fast around 2026-04-16 ~16:43 (as shown in the usage table).
Observe token totals for that window: other entries the same day may be far smaller, while one row shows an extreme spike (~52.6M tokens).

Note: I can’t reliably reproduce the spike on demand without knowing whether it depends on workspace size/indexing, tool loops, or metering, but the usage table already shows the anomaly at a specific timestamp.

Screenshots / Screen Recordings

Operating System

macOS

Version Information

Version: 3.1.15
VSCode Version: 1.105.1
Commit: 3a67af7b780e0bfc8d32aefa96b8ff1cb8817f80
Date: 2026-04-15T01:46:06.515Z (1 day ago)
Layout: editor
Build Type: Stable
Release Track: Early Access
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Darwin x64 25.3.0

Does this stop you from using Cursor

Sometimes - I can sometimes use Cursor

I thought it was because of Opus 4.7 I was trying out, but I guess not.

Hey, thanks for the detailed report.

We checked the backend data for your session. Quick summary: the metric is correct, this isn’t a billing bug or double counting.

Here’s what happened: that specific request made 435 tool calls over about 38 minutes. Each agent tool call sends the full context window, about 120K to 185K tokens per call, so in total it’s 435 x about 120K = roughly 52M tokens, in line with the ~52.6M entry. That’s the current agent architecture: each step, like reading a file, writing, or checking, is counted as a separate call with full context.
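To make the arithmetic concrete, here is a minimal sketch that multiplies out the figures quoted in this thread. The constant and function names are illustrative only, and it assumes every call re-sends a context of the same size, which is a simplification since the actual per-call sizes are not given here.

```python
# Figures quoted in this thread; all names below are illustrative.
TOOL_CALLS = 435
CONTEXT_LOW = 120_000        # lower bound of tokens sent per tool call
CONTEXT_HIGH = 185_000       # upper bound of tokens sent per tool call
REPORTED_TOTAL = 52_600_000  # approximate value shown in the usage table


def total_tokens(calls: int, tokens_per_call: int) -> int:
    """Total tokens if every tool call re-sends a context of the same size."""
    return calls * tokens_per_call


low_estimate = total_tokens(TOOL_CALLS, CONTEXT_LOW)
high_estimate = total_tokens(TOOL_CALLS, CONTEXT_HIGH)
implied_average = REPORTED_TOTAL / TOOL_CALLS  # average tokens per call

print(f"estimate: {low_estimate:,} to {high_estimate:,} tokens")
print(f"implied average: {implied_average:,.0f} tokens per call")
```

The reported total sits near the low end of that range, which would be consistent with most calls carrying close to 120K tokens, though that is an inference from the totals rather than something stated in the backend data.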

For a task like “revise 15 Markdown chapters”, 435 tool calls is definitely excessive, and I’ve passed your feedback to the team. The agent likely worked inefficiently, rereading and rewriting files in a loop instead of finishing in fewer steps.

A couple of notes:

  • All tokens are marked as included under the Pro+ plan, so there’s no extra cost.
  • To avoid spikes like this in the future, you can split the task into smaller batches, like 3 to 5 files at a time instead of 15. That limits how many tool calls happen in one session.
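The batching suggestion can be sketched as a small helper that splits the chapter list into groups before each group is handed to the agent in its own session. The `batches` function and the file list are hypothetical, built from the ch4.md–ch18.md example in the report; nothing here is a Cursor API.

```python
from typing import Iterator


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical file list matching the ch4.md to ch18.md example (15 files).
chapters = [f"ch{n}.md" for n in range(4, 19)]

for group in batches(chapters, 5):
    # Each group would be given to the agent as a separate session.
    print(group)
```

With a batch size of 5 this gives three sessions of five files each; a size of 3 would give five sessions.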

On output quality, like mixing Simplified and Traditional Chinese and sticking to the 140-line minimum instead of the 180-line target: that’s a separate issue with how the model reads range-based instructions. I’d recommend writing a clear rule such as “target about 180 lines” as the goal, instead of giving a 140 to 220 range, so the model doesn’t optimize toward the lower bound.
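One way to catch drafts that hug the lower bound is a small check run on each revised file after the agent finishes. This is only a sketch: the 140/180/220 figures come from the project rule described in the report, while the 15-line tolerance and the function names are assumptions made for illustration.

```python
def count_lines(text: str) -> int:
    """Number of lines in a chapter's text."""
    return len(text.splitlines())


def classify_length(
    line_count: int,
    target: int = 180,    # intended "healthy" length from the project rule
    minimum: int = 140,   # lower bound from the project rule
    maximum: int = 220,   # upper bound from the project rule
    tolerance: int = 15,  # assumed slack around the target (hypothetical)
) -> str:
    """Label a draft as on target, in range but off target, or out of range."""
    if line_count < minimum or line_count > maximum:
        return "out_of_range"
    if abs(line_count - target) <= tolerance:
        return "on_target"
    return "in_range_off_target"


# A 140-line draft passes the range check but is flagged as off target.
print(classify_length(140))
```

Under these assumptions a uniformly ~140-line draft is reported as in range but off target, which is the failure mode described in the report.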

Let me know if you’ve got any questions.