Max mode vs non-max mode (context max, not thinking max)

Hey, this is expected behavior. Max Mode always applies a 1.2x multiplier to the API cost, no matter how much context is actually used in the conversation.

From the docs: Max Mode | Cursor Docs

Max Mode consumes usage at 1.2x the normal API rate for the selected model.

The 1.2x is a flat fee for access to Max Mode features: extended context, subagents running models other than Composer-1, and image generation on request-based plans. It is not a charge for actually going over the base context window, so you pay it even when the conversation stays well within the normal context size.
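
To make the arithmetic concrete, here is a rough sketch of how the multiplier works out. This is not Cursor's actual billing code: the function name `billed_usage` and the dollar amounts are made up for illustration, and the only figure taken from the docs is the 1.2x rate.

```python
from decimal import Decimal

# The 1.2x rate comes from the docs quote above; everything else is hypothetical.
MAX_MODE_MULTIPLIER = Decimal("1.2")


def billed_usage(base_api_cost: Decimal, max_mode: bool) -> Decimal:
    """Return the usage consumed by one request.

    base_api_cost is whatever the request would cost at the model's normal
    API rate. The multiplier depends only on whether Max Mode is on, not on
    how much of the context window the conversation actually uses.
    """
    if max_mode:
        return base_api_cost * MAX_MODE_MULTIPLIER
    return base_api_cost


# Hypothetical request that costs $0.50 at the normal API rate.
small_request = Decimal("0.50")
print(billed_usage(small_request, max_mode=False))
print(billed_usage(small_request, max_mode=True))
```

With Max Mode on, the same request consumes 20% more usage, even if it would have fit in the default context window.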

If you are not using subagents or image generation, it is usually easier to keep Max Mode off. Colin explains it in detail here: Claude opus 4.5: Max vs Default mode - #3 by Colin