Max Mode uses significantly more context per request, which is why you’re seeing higher usage than expected. It’s designed for situations where you need the model’s full capabilities and aren’t particularly cost-sensitive. Max Mode will consume usage much faster than non-Max mode.
If you’d like to stretch your usage across more requests, consider sticking with regular (non-Max) mode for most tasks.
It’s just sad that I didn’t get a warning and that I’m already done for the month, all because of a small mistake. I just can’t believe the usage. I know it’s probably down to the extra context, but in other scenarios we are hitting above 1M tokens with just one request, so it seems too high.
My record is 120.8 requests in a single prompt. I can only use Max mode once I’ve burned through my included requests and it is counting against on-demand usage.