I am currently using Cursor Pro, but the monthly limit of 500 Fast requests for Premium models seems insufficient for my needs. The billing questions I want to clarify mainly include the following:
After I exhaust my 500 Fast requests quota, will the Premium models switch to slow requests? At that point, can models like Claude 3.5, Claude 3.7, and OpenAI o1 still be used normally, just with slower response times?
What are the main differences between slow requests and Fast requests? Does the response speed become significantly slower? If so, how much slower will it be? Can you provide an estimate of the slowdown?
In Cursor’s model settings, I can input my own OpenAI API Key and Anthropic API Key. After entering my keys, once the 500 Fast requests quota is used up, can I continue using Premium models’ Fast requests by consuming my own API quota? Furthermore, would I still be able to use agent mode and Edit mode with models like Claude 3.5, Claude 3.7, and OpenAI o1? I know that non-Pro users who input their own API keys can only use chat mode.
There’s a new option in the Cursor account called “Enable usage-based pricing.” Is this similar to the method of entering my own OpenAI API Key and Anthropic API Key? That is, after exhausting the 500 Fast requests, API calls start being used—but one option involves paying Cursor, while the other means paying OpenAI or Anthropic directly?
I look forward to your reply to resolve my confusion. Thank you!
Yes, after you exhaust your quota, you’ll be switched to slow requests, but all premium models will still be available to you.
The difference between fast and slow requests is queueing: once your fast quota is used up, your requests are moved to a slow pool. Wait times in the slow pool are proportional to how many slow requests you’ve already used.
Adding your own API key does not affect the fast requests you receive from your premium subscription. When using your own key you will also be limited to chat only, but other premium features will still work for you.
Usage-based pricing is a per-request billing system, and it differs slightly from using an API key. With your own API key, you are billed by token usage for input and output; with usage-based pricing, one request costs 4 cents, regardless of how many tokens are used in the request/response.
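To make the comparison concrete, here is a rough sketch of the arithmetic. Only the 4 cents per request figure comes from this reply. The function names and the token rates passed in at the bottom are hypothetical placeholders, so substitute the current rates from your provider's pricing page.

```python
# Flat per-request price quoted above (4 cents per request).
CENTS_PER_REQUEST = 4


def usage_based_cost_cents(requests: int) -> int:
    """Usage-based pricing: a flat fee per request, token count ignored."""
    return requests * CENTS_PER_REQUEST


def api_key_cost_cents(input_tokens: int, output_tokens: int,
                       input_cents_per_million: float,
                       output_cents_per_million: float) -> float:
    """Own API key: the provider bills by input and output tokens."""
    total = (input_tokens * input_cents_per_million
             + output_tokens * output_cents_per_million)
    return total / 1_000_000


# 100 requests beyond the included 500, billed per request.
print(usage_based_cost_cents(100))

# One request with 2,000 input and 500 output tokens, using
# hypothetical rates (300 and 1500 cents per million tokens).
print(api_key_cost_cents(2_000, 500, 300, 1500))
```

Which option is cheaper depends on how large your requests are: long contexts favor the flat per-request fee, while short prompts may cost less per token through your own key.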
You can get more detailed information from the documentation:
That’s very useful, but after reading that article I’m still not clear about this part of the OP’s question:
“After entering my keys, once the 500 Fast requests quota is used up, can I continue using Premium models’ Fast requests by consuming my own API quota?”
You replied that using an API key in addition to a Pro subscription won’t affect the fast request quota. Does that mean I can add my Anthropic API key and it’ll only be used once my Pro subscription’s fast request quota is used up, but it’ll only be available in Chat?