Hey, thanks for the feedback here!
We’re working on updating our docs with all the relevant details and answers to your questions.
To summarise so far, rate limiting should be pretty dynamic. The models you pick will continue to have a “weight” to them, which will affect how quickly you get close to your rate limit - more expensive models will get you there quicker.
The system should be resilient to bursts of usage as well as usage over days, weeks and a full month. Like a health bar in a game, it will refill over time.
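To make the "health bar" idea a bit more concrete, here is a rough sketch of how a weighted, refilling budget could behave. This is only an illustration and not our actual implementation: the class name, the model names, the weights, the capacity and the refill rate are all made up for the example, and it tracks a single budget rather than separate daily, weekly and monthly windows.

```python
# Illustrative sketch only: all names and numbers here are hypothetical.
MODEL_WEIGHTS = {"small-model": 1.0, "large-model": 5.0}


class WeightedBudget:
    """A budget that drains per request and refills steadily over time."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity                  # the full "health bar"
        self.refill_per_second = refill_per_second
        self.level = capacity                     # start with a full bar
        self.last_update = 0.0                    # time of the last refill

    def _refill(self, now):
        # Add back budget for the time elapsed, never exceeding capacity.
        elapsed = max(0.0, now - self.last_update)
        self.level = min(self.capacity,
                         self.level + elapsed * self.refill_per_second)
        self.last_update = now

    def try_request(self, model, now):
        # Heavier models cost more, so they drain the bar faster.
        self._refill(now)
        cost = MODEL_WEIGHTS[model]
        if cost > self.level:
            return False                          # rate limited for now
        self.level -= cost
        return True


budget = WeightedBudget(capacity=10.0, refill_per_second=1.0)
first = budget.try_request("large-model", now=0.0)    # allowed, 5 left
second = budget.try_request("large-model", now=0.0)   # allowed, 0 left
third = budget.try_request("small-model", now=0.0)    # refused, bar is empty
later = budget.try_request("small-model", now=1.0)    # allowed after refill
```

In this sketch, two requests to the heavier model empty the bar as fast as ten requests to the lighter one would, and waiting lets the budget come back up to (but never past) its capacity.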
Based on our analysis so far, we think the majority of users shouldn’t see a rate limit often in standard usage!