Allow Selection Between gemini-2.5-pro-preview and -exp for Max Mode with API Key

@danperks
Related Closed Issue: https://forum.cursor.com/t/models-gemini-2-5-pro-max-is-not-found-for-api-version-v1main/77308

Bug Description:
When selecting the gemini-2.5-pro-max model with a personal Google API key (in my case Tier 3 billing with Gemini):

Cursor appears to route requests to the experimental gemini-2.5-pro-exp-03-25 model, not the billable preview gemini-2.5-pro-preview-03-25 model offered by Google.

This results in hitting the “User API Key Rate limit exceeded” error very quickly.

This rate limiting is inconsistent with the high limits expected for the -preview model under Tier 3 billing (which should be 2,000 RPM / 8,000,000 TPM, Ref: Google Docs), but it aligns with the lower limits of the -exp model.
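To put those Tier 3 numbers in perspective, the effective request budget is whichever of the RPM and TPM caps binds first. A quick sketch using the limits quoted above (2,000 RPM / 8,000,000 TPM) and Gemini 2.5 Pro's 1M-token context window:

```python
# Effective requests-per-minute is capped by both the request limit (RPM)
# and the token-throughput limit (TPM).
def effective_rpm(rpm_limit: int, tpm_limit: int, tokens_per_request: int) -> int:
    """Return how many requests per minute the quota actually allows."""
    return min(rpm_limit, tpm_limit // tokens_per_request)

# Tier 3 -preview limits from above: 2,000 RPM / 8,000,000 TPM.
# With full 1M-token contexts, TPM is the binding constraint:
print(effective_rpm(2_000, 8_000_000, 1_000_000))  # -> 8 full-context requests/min
# With small ~2k-token requests, RPM is the binding constraint:
print(effective_rpm(2_000, 8_000_000, 2_000))      # -> 2000
```

Even at the TPM-bound worst case, 8 full-context requests per minute is far more headroom than the -exp limits allow, which is why the premature 429s point at -exp routing.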

Users with billed API keys need the option to use gemini-2.5-pro-preview-03-25 with Max mode to utilize their purchased rate limits.

Suggestion: Explicitly offer both

gemini-2.5-pro-preview-03-25-max
&
gemini-2.5-pro-exp-03-25-max

in the model selector.
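One way to confirm which variants your own key can actually reach is Google's public `ListModels` endpoint (`GET https://generativelanguage.googleapis.com/v1beta/models?key=...`). A minimal sketch, assuming the documented response shape (`{"models": [{"name": "models/..."}]}`); the live network call is left commented out since it uses your quota:

```python
import json
import urllib.request

LIST_URL = "https://generativelanguage.googleapis.com/v1beta/models?key={key}"

def gemini_25_pro_variants(models_json: dict) -> list[str]:
    """Extract the gemini-2.5-pro model IDs from a ListModels response."""
    names = [m["name"].removeprefix("models/") for m in models_json.get("models", [])]
    return [n for n in names if n.startswith("gemini-2.5-pro")]

# To run against your own billed key (live network call):
# with urllib.request.urlopen(LIST_URL.format(key="YOUR_API_KEY")) as resp:
#     print(gemini_25_pro_variants(json.load(resp)))

# Offline example with the documented response shape:
sample = {"models": [{"name": "models/gemini-2.5-pro-preview-03-25"},
                     {"name": "models/gemini-2.5-pro-exp-03-25"},
                     {"name": "models/gemini-1.5-flash"}]}
print(gemini_25_pro_variants(sample))
# -> ['gemini-2.5-pro-preview-03-25', 'gemini-2.5-pro-exp-03-25']
```

If both IDs come back for your key, the rate limiting described above is a routing choice on Cursor's side, not a limitation of the key.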

How to Reproduce:

  1. Use a billed Google API key (Tier 3 tested).
  2. Select gemini-2.5-pro-max model.
  3. Make several requests quickly or use larger context.
  4. Observe premature rate limit errors.
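When step 4 hits, the only client-side mitigation (not a fix) is to back off exponentially on 429 responses. A sketch under stated assumptions: `RateLimitError` is a hypothetical marker for an HTTP 429, and `call` stands in for whatever function issues the Gemini request:

```python
import time

class RateLimitError(Exception):
    """Hypothetical marker for an HTTP 429 from the Gemini API."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `call` on RateLimitError, doubling the wait each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError(f"still rate-limited after {max_retries} retries")
```

With -exp limits this loop spends most of its time sleeping, which is exactly the "keep waiting" experience described below; routing to -preview would make it mostly unnecessary.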

OS and Cursor Version:
Windows 11, Cursor version: 0.49.3

Blocking Issue:
This prevents users with billed API keys from reliably using Gemini 2.5 Pro Max, because requests hit the lower -exp rate limits instead of the expected -preview limits. I have to keep waiting to use the Max model with my own API key. When I used the gemini-2.5-pro-preview-03-25 model directly via its API (outside of Cursor), I never hit these rate limits, even though the codebase was 3x the size.

6 Likes

I posted a related issue/feature request and will link here so that the issue is a little clearer: [REQUEST] agent support for gemini-2.5-pro-preview-03-25 via API keys

I requested agent support for -preview since that currently doesn’t exist. It looks like we need both: routing so that -max can go to -preview, and agent support for -preview itself. Right now there seems to be no full integration of -preview, even though the model is nearly identical to -exp, which does have agent support. I’m hoping it’s just a config change on Cursor’s side.

2 Likes

Yes, I agree. And even if they provide agent support for the preview model, I still don’t think it would use the full 1M-token context unless Max mode is used.

2 Likes

Hey, just flagged this to the team to look into, thanks for picking up on this!

3 Likes

Thanks Dan, should be an easy implementation and can hopefully be done today.

2 Likes

Thanks for posting this ticket - I was coming to follow up on my original thread to post exactly this. :+1:t4: :+1:t4:

My Cursor was updated today to version 0.49.5, but there’s still no sign of the fix for Gemini.

What’s the purpose of using your own API key?
Are you able to get a longer context without Max?
I’m wondering if this is better than enabling usage-based pricing.

The main reasons are as follows:

  • The exp and preview models have different rate limits, with the preview model offering greater scalability.
  • Since the exp and preview models also differ in cost, maintaining flexibility benefits a wider range of users.
  • Making requests directly to Gemini is significantly cheaper than going through Cursor, creating a strong incentive to bypass the max mode.
  • For users who, due to security or billing constraints, have no option but to access the Gemini API directly, this functionality provides a way to adopt Cursor while still staying within those limitations.
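The toggle being requested above is small from an API standpoint: the two variants differ only in the model-ID segment of the `generateContent` URL. A sketch of what direct access looks like (endpoint format per Google's REST API; the environment-variable switch is purely illustrative, not an existing Cursor feature):

```python
import os

BASE = "https://generativelanguage.googleapis.com/v1beta/models"
VARIANTS = {
    "preview": "gemini-2.5-pro-preview-03-25",  # billed, higher Tier 3 limits
    "exp": "gemini-2.5-pro-exp-03-25",          # experimental, lower limits
}

def generate_content_url(variant: str, api_key: str) -> str:
    """Build the generateContent URL for the chosen model variant."""
    model = VARIANTS[variant]
    return f"{BASE}/{model}:generateContent?key={api_key}"

# Pick the variant with e.g. an env var (illustrative only):
print(generate_content_url(os.environ.get("GEMINI_VARIANT", "preview"), "YOUR_KEY"))
```

Since everything else about the request is identical, exposing both variants in the model selector should come down to which model ID is substituted into that URL.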
1 Like

Hello team, I have my own API key from Google, and I’m getting this error:
[screenshot of the rate-limit error]
Is this a Cursor error, or do I need to fix it on Google’s side (e.g. by paying to increase the limit)?

Hey, unfortunately we take our guidance from Google on which endpoints and model versions to use under the hood, but as far as we know the rate limits are not too dissimilar, so the experience with the other model should be much the same.

Besides overriding the OpenAI endpoint URL, I unfortunately think there may not be any changes here.

@superjavi This is a Google error caused by heavy use of your API key, and unfortunately there is no way to bypass it; you’ll just have to wait until you are no longer rate limited. You can learn more here:

1 Like

@danperks when will you allow us to use the preview model ?

Hey, this should already work with your API key.

Yes, but it’s not using the preview model; it’s using the exp model.

Despite previous discussions and multiple requests, it seems the issue is still not fully understood, so I’d like to clarify it once again.

  • Gemini 2.5 model comes in two variants: exp and preview.
  • There are differences between these two models, such as API rate limits and costs, which is why there is a clear need to use both depending on the situation.
  • Currently, there is no feature that allows users to switch between exp and preview models. That is why we are specifically requesting this functionality.
  • Simply responding with “exp is supported” does not address our request and does nothing to solve the underlying issue.

I hope this clears things up and that we’re now on the same page moving forward.

3 Likes

You made the point perfectly clear.

How long do you think till @danperks updates it so we can use the preview model?

2 Likes

I hope the preview model can be added as soon as possible, since it has long been available for free in Google AI Studio.

Dying on this… I keep hoping it just gets fixed, but so far no luck.

Hi, flagging to the team again to see if there’s anything we can do here.
Unfortunately, we mainly lean on Google’s advice regarding capacity and stability for normal usage, and the use of API keys has to be secondary to that.

1 Like

Thank you for taking this issue again.

In my own experience, the gap between the “preview” and “exp” models is not small.
It’s not just a matter of rate limits: the differences spill over into API stability, tool execution, and the quality of long‑context, multi‑file processing.

A friend of mine has a (somewhat obsessive) routine of sending the exact same request to the LLM endpoint every day, and his text‑mining metrics show a clear decline in response quality. It makes me wonder whether Google is simply unaware of how rough the API has become, or, less charitably, is trying to gloss over the problem.

In any case, most of these headaches would disappear if the endpoint tied to our API keys were switched to the “preview” model. Even better would be a way to toggle freely between “preview” and “exp.”

Thanks again to the support team for your help.

3 Likes