Hi everyone, let me address the main points:
- Usage listed is per request, which may contain multiple tool calls.
- We show the token usage that the AI providers report directly via their APIs.
- Longer chats or chats with many tool calls accumulate token usage over time.
- Context usage adds up as well, so remove any unnecessary attachments, rules, MCPs, etc.
- Thinking models use more tokens than non-thinking models, and
- Heavier models like Opus cost 5x as much as Sonnet.
- Use Auto where possible to reduce consumption.
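To illustrate why long chats and tool calls drive usage up, here is a minimal sketch of per-request accumulation. The field names (`prompt_tokens`, `completion_tokens`) mirror common provider API responses, but the numbers and the step layout are illustrative assumptions, not our actual billing logic.

```python
def accumulate_usage(steps):
    """Sum token usage over every model call in one request.

    Each tool call triggers another model call, and the growing
    context is re-sent as prompt tokens each time, so usage compounds.
    """
    total = {"prompt_tokens": 0, "completion_tokens": 0}
    for step in steps:
        total["prompt_tokens"] += step["prompt_tokens"]
        total["completion_tokens"] += step["completion_tokens"]
    return total

# One request with two tool calls means three model calls; note how
# the prompt grows each step because prior context is included again.
# All numbers below are made up for illustration.
request_steps = [
    {"prompt_tokens": 2000, "completion_tokens": 300},  # initial message
    {"prompt_tokens": 2400, "completion_tokens": 250},  # after tool call 1
    {"prompt_tokens": 2800, "completion_tokens": 400},  # after tool call 2
]

usage = accumulate_usage(request_steps)
print(usage)  # total tokens billed for this single request
```

A heavier model multiplies this same total: with Opus at roughly 5x Sonnet's rate, the identical conversation costs about five times as much, which is why Auto defaults to cheaper models when they suffice.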
More on token usage and how to optimize it to get more out of your plan: