I’m curious how you all choose which models to use. These days, I use GPT-4.1 for simple questions and Gemini for complex ones. For agent mode, I use o4-mini for straightforward tasks and Sonnet 3.7 for difficult problems. I used to run everything on Sonnet 3.5, but it’s become too slow, so now I’m switching models all the time. Personally, I wish I could preset which model to use for each mode to cut down on the exploration overhead.
I also tried the “automatic” mode, but it doesn’t always run faster, and it doesn’t produce better results than picking the model myself—so I stopped using it.
I always look at what the current top model is (going by the benchmarks on the Aider LLM Leaderboards). For every coding task I currently use Gemini through my API key, and I also tell it to use as many tokens as it wants to get the best performance. For general questions I use Sonnet 3.7 or o4-mini, i.e. the models that don’t cost or use anything of my own resources.
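In case anyone wonders what “tell it to use as many tokens as it wants” looks like when calling Gemini directly with an API key, here’s a minimal sketch with the google-generativeai Python package. The model name and the 65,536 output-token cap are assumptions on my side, so check the limits of whatever model currently tops the leaderboard:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # your own Gemini API key

model = genai.GenerativeModel(
    "gemini-2.5-pro-preview-03-25",  # example name: use whatever tops the leaderboard
    generation_config={
        "max_output_tokens": 65536,  # be generous so long answers aren't cut off
        "temperature": 0.2,          # lower temperature tends to suit coding tasks
    },
)

response = model.generate_content("Refactor this function to remove the duplication: ...")
print(response.text)
```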
I don’t know if it is a bug, but I get constant rate-limit issues with the Cursor models. I went to the Gemini page and added a model with the name ‘gemini-2.5-pro-preview-03-25’. Since I did that, I can prompt without a problem with my API key. From my understanding, this is the official model name from Gemini. Here is the API doc: Gemini models | Gemini API | Google AI for Developers
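If you want to double-check that this name is valid for your key before adding it in Cursor, a quick sketch like the one below (using the google-generativeai package, nothing Cursor-specific) lists every model your key can call:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# List every model this key can use for generation; the preview model should
# show up as "models/gemini-2.5-pro-preview-03-25" if the name is correct.
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)
```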
Probably because Cursor still uses the experimental model, which is the free and severely rate-limited (5 RPM, 25 RPD) counterpart of the preview model. Can you use tool calls with the preview one? Can you enable thinking on it?
I haven’t tried tool calls. You can’t enable thinking on it, but I think it uses thinking by default and Cursor just doesn’t show the thinking process since it’s not one of its own models. That’s only an assumption on my part, though.
I’m literally doing the same thing. To be honest, I was shocked when I read your post, because it’s exactly my setup and it works perfectly. This is the perfect combination. Good job, mate! Cheers to us.
Yeah, this is correct. Agent mode will not work automatically. It might create a code block you can then add to a file (or create the file if it doesn’t exist), but you’ll have to trigger that manually. It can’t edit the files itself, it can’t search your files or access any part of them unless you include them in the request, and it can’t search the web either.
Basically you’re limited to the manual mode with such models.
I start with “automatic”, whatever Cursor chooses. As things become more complex, I switch to Gemini.
In my experience Gemini is faster and sometimes smarter than Sonnet, but it is not as solid and stable as Sonnet 3.7. I usually use Gemini until it becomes lazy or dumb, then switch to Sonnet 3.7, which almost never becomes lazy or dumb. It’s probably due to inference-budget manipulation on Google’s side.
I only use thinking models, as the non-thinking ones are much worse.
When/if I’m stuck, I use o3 to dig deeper. I might use o4-mini, but only the high version, as the standard one doesn’t show anything outstanding that would win over Gemini or Sonnet.
Hot take: It doesn’t matter beyond how much $$ you pay.
Yes, different models have different performance on benchmarks, and some “feel” better than others. However, we are talking about small percentage improvements on artificial benchmarks that are currently under scrutiny on account of being gamed by all the major LLM providers. “Feels better” may be a factor; it’s up to you how much you choose to rely on your feelings or the feelings of others in choosing models.
The main differentiators are broad categories that are not specific to any one model:
can the model reason? (RL-based fine-tuning aka “thinking” models)
how large is the context window
how are you paying for the model (free, premium, consumption-based)
The choice I make is as follows:
For “Ask” I use a free model (cursor-small, gpt-4o-mini). Reasoning models are fine-tuned on code and math problems (not “fuzzy” brainstorming), so unless you ask the model for napkin math (don’t, you risk hallucination), they won’t help you here and you’d be paying a premium to use them. The allotted input context window is about 20k tokens across all models (enforced by Cursor), so that factor is irrelevant here.
For “Agent” I currently use claude-3.7-sonnet. You can also use any other reasoning model (o3, grok-3, gemini-2.5, …). We want to generate code, and basically every paper and benchmark under the sun shows a clear and significant performance boost for “reasoning” models. I try to stick to premium models instead of models priced via consumption (I have infinite ambition, but a very finite wallet). Input context is, again, limited to 20k for chat and 10k for cmd+K completion, so no differentiator here (there’s a rough token-count sketch at the end of this post).
The piece I am currently experimenting with is “normal” reasoning vs MAX models. It’s significantly more expensive, but we get 2x the context in the model itself, and file references pass up to 750 lines instead of 250 lines. I think this changes how you can structure your codebase, but it’s too early for me to have a strong opinion here. Let’s see what the community discovers.
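As a footnote on those context limits: if you want a rough feel for whether a file even fits in a ~20k-token window before you reference it, a quick tokenizer check is enough. This is just a sketch using OpenAI’s tiktoken as an approximation (other vendors tokenize differently, and the file path is only an example):

```python
import tiktoken

# cl100k_base is an OpenAI tokenizer, so this is only an estimate for other
# vendors' models, but it's close enough for a fits/doesn't-fit sanity check.
enc = tiktoken.get_encoding("cl100k_base")

with open("src/big_module.py") as f:  # example path, substitute your own file
    text = f.read()

n_tokens = len(enc.encode(text))
print(f"{n_tokens} tokens -> {'fits in' if n_tokens < 20_000 else 'exceeds'} a 20k context window")
```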