Understanding Model Usage and Request Limits for Claude 3.5 Sonnet in Cursor

Could anyone explain in more detail how the usage works? I use Cursor at work, and I don’t understand this tooltip:
GPT-4, GPT-4o, and Claude 3.5 Sonnet (Claude 3.5 Haiku is 1/3 of a premium request). You may use unlimited slow requests (may run slower during high demand).

I only want to use Claude 3.5 Sonnet for code generation. Should I set it as my only model, or is there a better way? Also, how can I figure out how much each request counts toward the limit? Will all requests go to Claude 3.5 Sonnet regardless of whether they are fast or slow? If so, does buying more requests only make them faster without changing the limits?

Thanks!

Hey, in Cursor, fast requests are limited (e.g., 500 requests per month), while slow requests are unlimited but depend on the current server load. Slow requests may get lower priority, so the model’s response can be delayed: usually by a few seconds, but longer during periods of high demand.

You can use only the Claude model if you prefer, but for simple tasks or generating documentation, I’d suggest keeping one or more non-premium models enabled.

Purchasing additional requests increases your fast request limit. Once your fast requests are used up, Cursor automatically switches to slow requests.
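
To make the accounting concrete, here is a rough Python sketch of how I picture it. This is only my own model of the behavior described above, not Cursor's actual implementation: the class, the method names, and the model identifiers are all made up for illustration. The only figures taken from this thread are the 500-request example limit and the 1/3 cost for Claude 3.5 Haiku from the tooltip.

```python
from fractions import Fraction

# Assumed cost of one request, in "premium request" units.
# The 1/3 for Haiku comes from the tooltip; the identifiers are hypothetical.
COST = {
    "claude-3.5-sonnet": Fraction(1),
    "gpt-4o": Fraction(1),
    "claude-3.5-haiku": Fraction(1, 3),
}


class UsageTracker:
    """Toy model of a monthly fast-request quota (an assumption, not Cursor's code)."""

    def __init__(self, fast_limit=500):
        self.fast_limit = Fraction(fast_limit)
        self.used = Fraction(0)

    def request(self, model):
        """Return 'fast' while quota remains for this model, otherwise 'slow'."""
        cost = COST[model]
        if self.used + cost <= self.fast_limit:
            self.used += cost
            return "fast"
        # The request still goes to the same model; it is just queued as slow.
        return "slow"

    def buy(self, extra_requests):
        """Buying more requests raises the fast limit; it does not change the model."""
        self.fast_limit += Fraction(extra_requests)
```

Under these assumptions, with a limit of 2, one Sonnet request plus one Haiku request leaves 2/3 of a request: a further Sonnet request would be slow, while two more Haiku requests would still be fast.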

If you have any other questions, feel free to ask here.

I don’t see an option to restrict a specific model to certain tasks. So, if I select multiple models, wouldn’t my code generation possibly be done by a different model?

No. You choose which model handles a request yourself, by changing the model at the bottom of the chat window.