Seems like the usage quota only applies per model – using up Sonnet doesn't let you switch to Gemini!

I was going through the "Expected usage within limits" section and noticed something that might not be obvious at first glance.

For each subscription tier (Pro, Pro+, Ultra), the listed usage is per model, not shared across models. For example:

  • Pro: ~225 Sonnet or ~550 Gemini or ~650 GPT-4.1 requests
  • Pro+: ~675 Sonnet or ~1,650 Gemini or ~1,950 GPT-4.1 requests
  • Ultra: ~4,500 Sonnet or ~11,000 Gemini or ~13,000 GPT-4.1 requests

This means if you use up your Sonnet quota (say, 225 requests on Pro), you can’t just switch to Gemini and get another 550 requests. These aren’t pooled limits; they’re independent usage expectations for the median user per model.

So in practice, you’re limited within whichever model you’re using, and maxing out one doesn’t give you access to the others unless you’re within their separate thresholds.

Might be worth keeping in mind if you’re switching between models or trying to optimize your monthly usage.

Hi @tien_ht, and thank you for the feedback. Yes, the examples are for average usage per model.

I also wrote up a more detailed how-to on Token usage with LLMs, which may help with understanding and managing your plan.