A Cloud Agent ran a not-so-difficult task for 11 min, used 2.2M tokens, and was counted as ~100 requests?

Well, this is my first time using Cloud Agent, and I’m a Pro user on the old plan. I’m shocked at how this works, and it seems totally ridiculous to me. This could easily be done in one shot locally (counted as 2 requests)…

So is it expected?

Hey, this is expected behavior, even though I get why it looks shocking at first.

A few things are happening here:

Cloud Agents always run in Max Mode. You can’t turn that off. On older plans that bill per request, every step the agent takes (reading files, making edits, running terminal commands, checking output) counts as a separate request. An 11-minute session with lots of iterations can easily hit around 100 requests.

Model choice matters a lot. You picked claude-4.6-opus-high-thinking, which is one of the most expensive models available. The high-thinking option is especially token-hungry.

For simple tasks, it’s better to use a local agent. Cloud Agents are meant for longer tasks you can parallelize, when you want to hand something off and come back later. If you can do it in one pass locally, do it locally.

If you still want to use Cloud Agents, try switching to Composer 2 or Codex 5.3. They’re much cheaper. You can also track cost in real time on your usage dashboard: https://cursor.com/dashboard/usage

Also worth noting: the new usage-based billing shows spend in USD instead of request count, which makes it much easier to track.
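
To put the numbers from the question in perspective, here is a rough back-of-the-envelope sketch in Python. It only averages the reported figures (11 minutes, 2.2M tokens, about 100 requests). It is not the actual billing formula, and the variable names are made up for illustration.

```python
# Figures reported in the question; the request count is approximate (~100).
total_tokens = 2_200_000
total_requests = 100
session_minutes = 11

# Simple averages only. This is NOT how billing is computed, just a sanity check.
tokens_per_request = total_tokens / total_requests
seconds_per_request = session_minutes * 60 / total_requests

print(f"{tokens_per_request:,.0f} tokens per request on average")
print(f"{seconds_per_request:.1f} seconds per request on average")
```

That works out to roughly 22,000 tokens and about 6.6 seconds per billed step, which is in line with an agent that issues one step after another for the whole session.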