For example, say GPT-5.3 is released and I’m still using GPT-5.2, or Composer 2.5 is released and I keep using Composer 2. (I deliberately picked models with the same price as examples; let’s assume most users have switched to the new models.)
So, is quota usage calculated solely from the tokens processed, or is there a coefficient based on the model’s load in the shared pool?
When you use models through the shared API pool, billing is based purely on the model’s API token cost. There’s no additional coefficient or adjustment based on how popular the model is or how much load it’s under.
So if GPT-5.3 and GPT-5.2 happen to have the same per-token price, they’d consume your quota at the same rate. The cost is determined by the tokens in and out, priced at each model’s published API rate. You can find the full breakdown of model pricing here.
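To make the arithmetic concrete, here is a minimal sketch of that calculation. The model names, per-million-token rates, and the `request_cost` helper are all hypothetical placeholders chosen for illustration, not real prices or an actual billing API; the point is only that the formula has no demand or load term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical published API rate, in dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(rate: ModelRate, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens in and out, each priced at the model's rate.

    Note there is no popularity or load coefficient anywhere in the formula.
    """
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / 1_000_000


# Placeholder rates (assumed, not real prices): both models cost the same per token.
RATES = {
    "model-old": ModelRate(input_per_million=2.0, output_per_million=8.0),
    "model-new": ModelRate(input_per_million=2.0, output_per_million=8.0),
}

old_cost = request_cost(RATES["model-old"], input_tokens=12_000, output_tokens=3_000)
new_cost = request_cost(RATES["model-new"], input_tokens=12_000, output_tokens=3_000)

print(f"old model: ${old_cost:.4f}")
print(f"new model: ${new_cost:.4f}")
```

Under these assumed rates, the same request costs the same on either model, regardless of how many other users have moved to the newer one.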
In short: it’s straight token math, not weighted by demand.