Initial token count too high (18.5k) for a new conversation

I created a new empty folder, opened it in Cursor, and just said “hello” to the AI. It cost 18.5k tokens.

My setup is very minimal — barely any rules, no user-configured MCP servers (the AI claims there is a built-in cursor-ide-browser, but I can’t control it), and no skills.

18.5k tokens for a simple greeting seems abnormal. Is there any way to reduce or control this? I shouldn’t be spending this much just to say hello.

Request ID: d3b6efec-8952-499b-87ef-b9459b92597c

Environment:

  • Cursor: 2.5.20 (user setup)

  • VSCode Version: 1.105.1

  • Commit: 511523af765daeb1fa69500ab0df5b6524424610

  • Build Type: Stable

  • Release Track: Default

  • OS: Windows_NT x64 10.0.19045

Hey, good question. The ~18.5k tokens for a “hello” in an empty project is expected baseline behavior; this isn’t a bug.

Every agent request includes overhead you can’t fully remove:

  • Cursor’s internal system prompt, which tells the AI how to behave
  • Built-in tool definitions (file editing, terminal, search, etc.), even if you don’t use MCP
  • Any rules, even minimal ones

For reference, other AI coding assistants show similar baseline numbers. Claude Code CLI uses ~20k for a simple “hello”, and Codex in Cursor is about the same. So 18.5k is actually on the lower end.

A few things to keep in mind:

  • Most of this is cache read tokens, which are 10x cheaper than regular input tokens
  • The overhead is roughly constant per message; thanks to caching, total usage doesn’t grow linearly as you add messages to the same chat
  • Later messages in the same chat reuse the cached context
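To make the caching point concrete, here’s a minimal sketch of the cost arithmetic. The dollar prices are made up for illustration (they are not Cursor’s actual rates); the only assumption carried over from above is that cache reads cost roughly 1/10 of a regular input token.

```python
# Illustrative sketch with hypothetical prices: how a cached baseline
# changes the effective cost of a message. Only the 10x cache-read
# discount comes from the explanation above; the dollar figures are
# invented for the example.

INPUT_PRICE = 3.00 / 1_000_000       # hypothetical $ per regular input token
CACHE_READ_PRICE = INPUT_PRICE / 10  # cache reads at 1/10 the input price

def message_cost(total_tokens: int, cached_tokens: int) -> float:
    """Cost of one request: cached tokens billed at the discounted
    rate, the remainder at the full input rate."""
    fresh = total_tokens - cached_tokens
    return fresh * INPUT_PRICE + cached_tokens * CACHE_READ_PRICE

# First message: the full ~18.5k baseline, nothing cached yet.
first = message_cost(18_500, 0)
# A later message in the same chat: same overhead, but most of it
# (say ~18k tokens) is served from cache.
later = message_cost(18_500, 18_000)
print(f"first: ${first:.5f}, later: ${later:.5f}")
```

Under these made-up prices the later message costs a small fraction of the first one, which is why the headline token count looks scarier than the actual spend.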

To minimize token usage overall, keep chats short and focused on one task, start fresh chats often, and avoid unnecessary rules or MCP servers.

There’s an active discussion with more details in this thread: Saying ‘hello’ uses 122,000 tokens – the cache usage seems inefficient

Let me know if anything else is unclear.