[ask] Does Gemini 2.5 support 1M context?

I just went through the documentation and noticed that other models only support up to 60k tokens.

However, we know Google Gemini 2.5 Pro supports up to 1M tokens. Is this a limitation in Cursor, or is there another explanation?

8 Likes

I would love to know what the actual context window of Gemini 2.5 Pro exp-03-25 vs. Gemini 2.5 Pro MAX is, and it would be great if the Cursor team actually added this to the documentation before releasing stuff.

4 Likes

Ah, they just launched a new version that supports Gemini Pro Max.
Haha, I bet it's similar to Claude Max.

I'd highly recommend using OpenRouter and Cline if you're planning to work with large contexts.

1 Like

Cline is too expensive

1 Like

Yesterday I used it with OpenRouter for free.
You can try it as well.

How did you use it for free? The API or the App?

I would really like Cursor to stop nerfing the LLMs' context. Seriously, this is getting tiresome.

I'm using AI Studio instead, because it offers the full 1M context and it's free. Why complicate this?

1 Like

Hi there.

Gemini API is free for now (experimental).

Looks like Cursor updated their page on context window sizes: Cursor – Models

According to their docs, it seems only their "Gemini 2.5 Pro MAX" setting actually uses the 1M token context. That's quite surprising.

Does anyone know if the Google API bills differently based on the requested context size itself? Like, is a request set up for 1M tokens priced differently than a request set up for 128k tokens?

I'm trying to avoid using Cursor's "MAX" option because of the $0.05 per-action cost. It charges you even for simple things like reading a file. The total cost from these small actions seems like it could add up to be much more than the actual Gemini 2.5 Pro API call would cost directly.

I agree with you that the $0.05 per tool use is quite high. Could that part be billed differently? Perhaps the initial request price could be increased instead?

Using ā€œFreeā€ is great for testing yes, but if you are working on a professional be careful that your prompt might be used for future model updates.

1 Like

Yes, Google's APIs (AI Studio and Vertex AI) charge differently depending on the number of tokens sent and generated. There's no "requested" context size. You're simply charged by how much you use, so a 128k request (actual tokens used) would be billed differently than a 1M request (actual tokens used).

Google's API charges per token:

  1. AI Studio API: Gemini Developer API Pricing | Gemini API | Google AI for Developers
  2. Vertex AI API: Vertex AI Pricing | Generative AI | Google Cloud

For example, Flash 2.0 in AI Studio's API is $0.10 per 1M input tokens and $0.40 per 1M output tokens (priced down to the individual token).

Pro 1.5 in AI Studio's API cost up to $2.50 per 1M input tokens, and up to $10 per 1M output tokens.

I would expect Pro 2.5 to be between $1 and $3 per 1M input tokens, and $5 to $15 per 1M output tokens.
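To make the per-token billing concrete, here is a quick sketch. The rates are the midpoint of my guess above ($2/M input, $10/M output), so treat the numbers as assumptions, not published pricing:

```python
# Illustrates per-token billing: a 128k-token request costs far less than a
# 1M-token one on the same model, because you pay for actual usage only.
# Rates below are assumed (midpoint of the guess above), not official.
INPUT_PER_TOKEN = 2.0 / 1_000_000   # assumed $2 per 1M input tokens
OUTPUT_PER_TOKEN = 10.0 / 1_000_000  # assumed $10 per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Direct API cost of one request at the assumed rates."""
    return input_tokens * INPUT_PER_TOKEN + output_tokens * OUTPUT_PER_TOKEN

print(round(request_cost(128_000, 4_000), 3))    # 0.296
print(round(request_cost(1_000_000, 4_000), 3))  # 2.04
```

So even if a model "supports" 1M context, you only pay for the tokens you actually send.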

So at $0.05 per tool use, where each tool call can have up to 1M input tokens, that's cheap. At $1 per 1M input tokens, $0.05 buys you up to 50k input tokens (actual use), or somewhat less once you also account for output tokens.
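The break-even point above is simple to verify. Here is the arithmetic, using the hypothetical $1/M input and $5/M output rates from this post (not official pricing):

```python
# Back-of-the-envelope break-even for a $0.05 flat per-tool-call fee,
# versus paying per token directly. All rates below are assumptions.
FLAT_FEE = 0.05           # Cursor's per-tool-use charge (USD)
INPUT_RATE = 1.0 / 1e6    # assumed USD per input token ($1 / 1M)
OUTPUT_RATE = 5.0 / 1e6   # assumed USD per output token ($5 / 1M)

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Direct API cost of one call at the assumed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Input tokens at which the flat fee equals the direct API cost
# (ignoring output tokens).
break_even_inputs = FLAT_FEE / INPUT_RATE
print(f"break-even input tokens: {break_even_inputs:,.0f}")

# With, say, 2k output tokens per call, fewer input tokens fit under $0.05:
remaining = (FLAT_FEE - api_cost(0, 2_000)) / INPUT_RATE
print(f"$0.05 covers about {remaining:,.0f} input tokens")
```

If a typical tool call actually uses well under 50k input tokens, the flat fee overcharges relative to direct API pricing; above that, it undercharges.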

2 Likes

Regarding tool usage and context limits, I suspect that tools don't each process the full 1 million tokens independently. It's more likely they use smaller models for pre-processing and then make one request to the core model. Am I wrong?

If so, the current $0.05 per-tool-use pricing seems counter-productive for users wanting to leverage the full context window of models like Gemini 2.5 Pro. The current approach forces users onto usage-based pricing, which can become significantly more expensive.

A better approach might be to introduce a higher subscription tier, maybe an 'Ultimate' plan, that includes full context access. While the base $20 plan might be insufficient for heavy usage, tiered options would offer more flexibility and predictability than the current per-tool fee combined with usage-based pricing for large contexts.

1 Like

They still include the conversation history, and some codebase context, per tool invocation (presumably). It's not really about the "tool call" itself, but that when these LLMs do a tool call, it's actually the end of an existing LLM invocation that is asking the Cursor system to run a tool. So when the tool is run and finished, they have to resend the convo history, context, and tool call result, so that the LLM can "continue" where it left off. Each tool call can roughly be equivalent to an LLM invocation or new user message.
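That loop can be sketched in a few lines. This is an illustrative agent loop, not Cursor's actual implementation; the function names are made up:

```python
# Minimal sketch of the agent loop described above: every tool call ends one
# LLM invocation, and the full history (plus the tool result) is re-sent on
# the next one, so input tokens grow with each tool call.
from typing import Callable

def run_agent(llm: Callable[[list], dict], tools: dict, user_msg: str) -> str:
    history = [{"role": "user", "content": user_msg}]
    while True:
        reply = llm(history)            # the WHOLE history is sent each time
        history.append(reply)
        if "tool_call" not in reply:    # no tool requested: final answer
            return reply["content"]
        name, args = reply["tool_call"]
        result = tools[name](*args)     # run the requested tool locally
        # The tool result goes back into the history, and a fresh LLM
        # invocation "continues" the conversation from where it left off.
        history.append({"role": "tool", "content": result})
```

This is why each tool call is billed like a new message: from the API's point of view, it is one.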

However, that all being said, I usually don't see it using more than 50k tokens (for me at least) per message or tool call. And the context only grows big enough for MAX to "maybe" matter when it starts prompting you to create a new chat :sweat_smile:.

I don't see much benefit to MAX for me. I don't have convos that last long enough for it to matter. I am using Cursor on codebases that are several hundred thousand lines of code, but it's never pulling in the entire codebase, even on MAX. I'm usually using one chat per feature. Yes, it's a higher max context limit, but larger context isn't necessarily better either (see MRCR benchmarks and others). I think they should instead offer to switch to MAX when it needs more context (giving that option to the user), but always start from the standard version.

The higher subscription tier is something I definitely would agree with. I wouldn't use it, but those that want to try to maximize the context might like that.

3 Likes

Hey, Gemini 2.5 Pro matches Claude 3.7 in having 120,000 tokens as standard, but can access the full 1M tokens in Max mode!

The Models page of our Docs is now updated to reflect this!

To @decsters01's point, we haven't nerfed or downgraded any of the LLMs specifically - the context window has always remained consistent, and has only increased as we have improved Cursor and its interaction with the models.

As @ThomasT says, while Google are offering Gemini for free via their API for individual users, they will likely use your data to train Gemini 3, so for now you do get that benefit with Cursor.

We are looking at the pricing structure for these models moving forward to see what is simple, sustainable and fair, but nothing is changing yet!

1 Like