This is getting out of hand

Hi everyone, let me address the main points:

  • Usage listed is per request, which may contain multiple tool calls.
  • We show the token usage that the AI providers report directly via their APIs.
  • Longer chats, or chats with many tool calls, accumulate token usage over time.
  • Context used adds up as well; remove any unnecessary attachments, rules, MCPs, etc.
  • Thinking models use more tokens than non-thinking models.
  • Heavier models like Opus cost 5x as much as Sonnet.
  • Use Auto where possible to reduce consumption.
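
To make the accumulation concrete, here is a small illustrative sketch (not the actual billing logic — all numbers, function names, and multipliers are made-up assumptions) showing why later requests in a long chat cost more than early ones, and how a heavier model multiplies the total:

```python
# Illustrative sketch only: how per-request token usage accumulates over a
# chat with tool calls, and how a heavier model multiplies cost.
# All numbers and the multiplier table are assumptions for demonstration.

# Hypothetical per-model cost multipliers relative to Sonnet.
MULTIPLIER = {"sonnet": 1.0, "opus": 5.0}

def request_tokens(context_tokens, tool_calls, tokens_per_call=500):
    """One request sends the accumulated context plus each tool call's I/O."""
    return context_tokens + tool_calls * tokens_per_call

def chat_cost(requests, model="sonnet"):
    """Sum token usage across requests; context grows as the chat gets longer."""
    total = 0
    context = 2_000  # assumed starting context (rules, attachments, etc.)
    for tool_calls in requests:
        total += request_tokens(context, tool_calls)
        context += 1_000  # assumed per-turn growth as history accumulates
    return total * MULTIPLIER[model]

usage = [0, 2, 5]  # tool calls per request in a short example chat
print(chat_cost(usage, "sonnet"))  # later requests cost more than earlier ones
print(chat_cost(usage, "opus"))    # same chat, 5x the cost on a heavier model
```

The takeaway is the same as the bullets above: every extra attachment, rule, or tool call inflates the context that each subsequent request re-sends, so trimming the context early compounds into real savings.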

More on token usage, and how to optimize it to get more out of your plan: