I am probably missing the details, but I still have not found enough information about the Ultra plan. It’s probably in a comment somewhere that I have not seen.
Does anyone have a good breakdown of rate limits? I have no issue paying $200, but I won’t if I don’t understand what it is buying me. Are rate limits only hard limits, i.e., there is a message in the UI that states I have exceeded my limits, or is there also throttling or queueing that works differently on the Ultra plan? Really only curious about the last part.
20x the usage of the Pro plan on all OpenAI, Claude, and Gemini models
Today there seems to be a bug where using Claude just 1-2 times hits the rate limit. The documentation is still a little unclear, though. I think you should buy Pro+ ($60) and try it before moving to Ultra.
With Ultra, you will get near-unlimited requests each month. There is no hard cap on the number of requests you can make, and while it follows the same rate limit system as Pro, Ultra has 20 times the rate limits, which should be incredibly difficult to ever reach.
If you were to hit a rate limit on any plan, you would be shown an alert in-app, with one of three options:
Upgrade to a higher plan
Enable usage-based pricing to pay for exactly what you use
Switch to a different model / model family
There are no “queues” or “slow requests” anymore, either, so things are much simpler.
Switching to different models or model families doesn’t work anymore. For the past couple of days, when I hit a rate limit with one model, I hit it with every model other than Gemini Flash. And yes, I can’t even use models like 4.1 or Sonnet 3.5.
We may have had an issue where rate limits were not correctly maintained across all models, but GPT 4.1 and Gemini Flash should both be truly unlimited, I believe. Is this not the case for you?
Now, when I hit a rate limit (which for me currently means after 3-4 prompts with Sonnet 4 non-thinking), I’m unable to use any model other than Gemini Flash and Auto.
I think Auto basically just uses 4.1 the vast majority of the time. I technically can use it, but I can’t specifically select 4.1 and use it.
This suggests you have likely used your burst rate limit already, which is large (always >= your plan cost) but very slow to refill. This leaves you with your ‘local’ limit, which is smaller but refills every few hours.
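To make the burst/local distinction concrete, here is a rough sketch of how a two-tier limit like this could behave. It is only an illustration: the class names, capacities, refill rates, and the "spend from burst first" rule are all assumptions on my part, not Cursor's actual implementation or numbers.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """A simple refillable allowance (all values are hypothetical)."""
    capacity: float         # maximum credit the bucket can hold
    refill_per_hour: float  # credit restored per elapsed hour
    level: float            # credit currently available

    def refill(self, hours: float) -> None:
        # Restore credit for the elapsed time, never exceeding capacity.
        self.level = min(self.capacity, self.level + self.refill_per_hour * hours)


class TwoTierLimiter:
    """Assumed model: a large, slow 'burst' bucket plus a small, fast 'local' one."""

    def __init__(self, burst: Bucket, local: Bucket) -> None:
        self.burst = burst
        self.local = local

    def advance(self, hours: float) -> None:
        # Both buckets refill as time passes, each at its own rate.
        self.burst.refill(hours)
        self.local.refill(hours)

    def try_spend(self, cost: float) -> bool:
        # Assumption: draw from the burst bucket first, then fall back to local.
        for bucket in (self.burst, self.local):
            if bucket.level >= cost:
                bucket.level -= cost
                return True
        return False  # neither bucket can cover the request: rate limited


# Hypothetical numbers: burst holds 20 units and refills slowly,
# local holds 2 units and refills fully within about four hours.
limiter = TwoTierLimiter(
    burst=Bucket(capacity=20.0, refill_per_hour=0.05, level=20.0),
    local=Bucket(capacity=2.0, refill_per_hour=0.5, level=2.0),
)
results = [limiter.try_spend(1.0) for _ in range(23)]
print(results.count(True))     # requests allowed before hitting the limit
limiter.advance(4)             # wait a few hours
print(limiter.try_spend(1.0))  # local has refilled, burst has barely moved
```

Under this model, once the burst bucket is drained you are effectively living off the small local bucket, which would match the "a few prompts and then rate limited" experience described in this thread.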
For users who have hit this already, we’ve found the vast majority have either changed their usage pattern or previously relied on usage-based pricing to top up their usage each month, and they should expect to continue doing so now!