What's your go-to model in Cursor? A frontend dev's take on Gemini 2.5 Pro vs. Claude 4 vs. GPT-5

Hi everyone,

I’m curious to know what AI model you’re all using in Cursor and what your experiences have been.

As a frontend developer (React, TypeScript, Next.js), I’ve found that Gemini 2.5 Pro works best for me. I switched from Claude 4 a while ago and haven’t looked back. In my opinion, Gemini 2.5 Pro seems more intelligent and, most importantly, follows my instructions much more accurately.

A key factor for me is its ability to be critical of my ideas and approach. When I’m focusing on clean code, solid architecture, and best practices, Gemini provides the critical feedback I need. We collaborate, and it challenges me when necessary. With Claude 4, I often felt it was too quick to “rubber-stamp” my ideas, even when I explicitly asked for critique. It seemed to struggle with providing truly critical feedback.

Regarding the GPT-5 family of models, I was really hoping for a significant leap in quality over both Gemini 2.5 Pro and Claude 4, but I’ve been deeply disappointed so far. My main issues are:

  • It’s extremely slow. The waiting time for responses is just too long for a daily workflow.
  • It doesn’t always follow instructions as precisely as the other models. It sometimes does more work than I ask for (which was occasionally also the case with the GPT-4 models and Claude 4).
  • I’ve encountered a strange bug where it gets stuck in a loop, repeatedly trying to end the conversation.

Because of these problems, I find the GPT-5 models unusable for my day-to-day work at the moment. I just don’t have the time to wrestle with the model to get what I need. I should add that I’ve read the latest OpenAI Cookbook for GPT-5, so I’ve been mindful of providing clear goals and instructions, but it still falls short in my experience.

So, what are your thoughts? What model are you using, and how has it been working for you?

P.S. This post was rewritten with Gemini 2.5 Pro and Grammarly AI for clarity, but I’m still a human :wink:

4 Likes

Interesting feedback… Maybe I should try that Gemini too :thinking:

I’m currently using Claude Sonnet 4. I’d love to use Opus 4.1, but it’s so expensive, and I haven’t won the lottery yet. In my experience, Claude is the only one that can actually finish the work well without major problems or lots of boilerplate code scattered here and there. That’s why I switched from GPT at the time, and I’ve run into the same obstacle with Gemini in the past.

I’ve done some brainstorming with Claude too, but I have to explicitly instruct it to do so. It hardly ever asks anything on its own, which would be nice to have. GPT-5, though, went to the other extreme: it couldn’t finish even the simplest thing without asking whether it should do it better.

I code in Python and have only used GPT for some time now (o3 → GPT-5). Gemini adds too many weird comments and doesn’t really work for me. Claude is good, but it’s too exhausting to work with: it adds too many features I didn’t ask for, so I stopped using it a few weeks after the Sonnet 4 release. I rarely add big features at once, so incremental edits with GPT are my go-to. It was also better at debugging for me.

1 Like

I do a lot of web UI/UX, backend API work, etc., and my daily go-to for these tasks is Claude Sonnet 4.

GPT-4.1 for me; I’ve been using it primarily since it came out. It’s reliable and does good work, provided you feed it good prompts. A favorite feature is that it never makes changes without being told explicitly to do so, even in agent mode, and it’s very good about letting me review intended changes before it starts implementing.

I work on a variety of codebases:

  • Old, monolithic Rails
  • Modern React Native
  • Laravel 10

Every now and then, if I have a particularly complex problem that GPT-4.1 and I haven’t been able to solve, I’ll try other models, with mixed results. I also usually test-drive new ones when they come out, but so far I always gravitate back to good ol’ 4.1.

2 Likes

I code with Flask and MongoDB, so that means HTML and some JavaScript for the frontend. I mainly use Sonnet 4 Thinking, Sonnet 4, Auto, and Gemini. I’ve noticed that Sonnet isn’t very good at creating frontend designs: it always gets the colors wrong, and its layouts aren’t great. Auto and Gemini, on the other hand, are very good, so I usually use one of those for frontend design and coding.

For documentation, md files, etc., I’ve tried GPT-5, Sonnet 4 Thinking, and Auto, and I find Auto is the best. For coding Python, I really like Sonnet Thinking. I sometimes use Gemini when something isn’t going right or I want to try new things, but mainly it’s Sonnet Thinking for coding, Auto for frontend 80% of the time (Gemini the other 20%), and Auto for documentation, md files, etc. GPT-5 is way too slow for me. But now that you’ve said you use Gemini most of the time, I’ll definitely try it out more often!

gpt5-low seems to work OK; low-fast if you want it faster. I agree with most of this: each model has different capabilities, and learning what to use each one for is a lot of work. Claude is a pretty well-rounded model that can do everything, but the eagerness needs to be turned down a bit, which I think Cursor has to do in their API call; we can’t do it ourselves.

Gemini 2.5 Pro does work well for frontend, unsurprisingly, since it’s Google; that’s what they do most of.
GPT-5 is very robotic, but it doesn’t go off and do lots of other things. While it’s a bit slow, it seems to be better at targeted changes.

Grok4-code-fast (I forget the exact name) works fairly well and is insanely fast, but it does make some mistakes and doesn’t have a large context window compared to Claude (Max) or Gemini (Max).

I seem to use them all together: if I’m not really happy with an answer from one, I’ll open a tab with another.
I hope one day Cursor has an agent that sends the prompt to several models at once, weighs and checks the results, and even has them talk to each other to produce a better answer. They’re like a team of developers, each with a different personality and a different way of looking at things.
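
Just to sketch the shape of what I mean, here’s a made-up TypeScript outline. Everything in it is hypothetical: callModel, the model IDs, and the whole API are invented for illustration, since Cursor doesn’t expose anything like this today.

```ts
// Hypothetical fan-out: send one prompt to several models in parallel,
// then let a judge model weigh and merge the candidate answers.
type ModelName = "gemini-2.5-pro" | "claude-sonnet-4" | "gpt-5";

// Stand-in for whatever provider API you'd actually call; stubbed here.
async function callModel(model: ModelName, prompt: string): Promise<string> {
  return `answer from ${model}`;
}

// Ask every model at once and pair each answer with its source.
async function fanOut(prompt: string) {
  const models: ModelName[] = ["gemini-2.5-pro", "claude-sonnet-4", "gpt-5"];
  const answers = await Promise.all(models.map((m) => callModel(m, prompt)));
  return models.map((model, i) => ({ model, answer: answers[i] }));
}

// One model acts as reviewer over everyone's output.
async function judge(prompt: string): Promise<string> {
  const results = await fanOut(prompt);
  const digest = results
    .map((r) => `--- ${r.model} ---\n${r.answer}`)
    .join("\n");
  return callModel(
    "gpt-5",
    `Task:\n${prompt}\n\nCandidate answers:\n${digest}\n\nPick or merge the best one.`
  );
}
```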

I do find myself always gravitating back to Claude 4 Max, though; it’s just the most well-rounded if you can get it to curb the overenthusiasm. Sometimes I use Claude in the actual Claude UI or Claude Code instead of Cursor, because it acts differently, and then I feed that output into the Cursor session (likewise, I use GPT-5 in the GPT account online and have it write prompts for the Cursor one). It saves money overall to have a few accounts, I guess.

1 Like

Are you suggesting that GPT-5 is just an “updated” o3? I’m curious, since there was so much hype about GPT-5, but for me it’s worse than GPT-4.5 for coding (mostly because of the speed).

Yeah, the most irritating thing is that you need to spend time actually learning how different models work and where they thrive. I don’t want to come across as an ignoramus; I know that prompting technique matters, and I’ve taken prompting courses in the past, but I don’t want to have to rethink my prompting technique every time I use a different model.

In day-to-day work, I need to be efficient and iterate quickly without having to think about which model to use for each particular case.

That’s why I find Gemini works best for most of my cases as a frontend developer working with TypeScript and Next.js code.

1 Like

And you don’t get any tool call issues with Gemini?

Please share your rules or prompts for getting critical feedback from Gemini. In my experience, Gemini really struggles with critical thinking (just like most other models). There have been discussions about it here, and others have shared similar sentiments, so if you’ve solved this, it’s worth spreading.

I use gpt5-high as a full replacement for o3, and it does everything I need, plus gpt-5 for simpler tasks and occasional free qwen-coder calls through Roo Code when I don’t want to spend tokens. Keep in mind that most of the time I’m not writing lots of code; as an MLE/DS, I’m mostly designing experiments, changing small parts of models, debugging things, etc., so what I need differs vastly from typical frontend/backend coding.

I don’t have any fixed prompts or prompt patterns for that.

Most often I prompt Gemini like this:

  • I share my idea for implementing the feature at hand,

  • I share my concerns about that idea, especially regarding the maintainability of the proposed solution (quite often in the prompt I mention clean code, clean architecture, or DDD as the sort of “golden standards” I wish my code could follow),

  • sometimes I explicitly mention granular code patterns I’ve considered using, e.g., “I don’t like these if/elses; I was thinking about maybe using switch statements instead for clarity and better code maintainability”,

  • I ask Gemini for feedback on my idea and tell the model to criticize it if needed and propose a better solution based on the best software architecture patterns out there (here, again, I mention clean code, clean architecture, DDD),

  • Quite often I will say: please first give feedback on my plan, don’t touch the code yet,

  • Gemini then revisits my idea and either comes up with something completely different (as in that example: a strategy pattern instead of multiple ifs; see the sketch after this list) or says: points X and Y from your plan are good, but Z could have been done better, etc.
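
To give a rough idea of what that kind of refactor looks like, here’s a minimal TypeScript sketch. The discount example and every name in it are made up for illustration; it’s not the actual code from that chat.

```ts
// Before: a growing if/else chain; every new rule means editing the conditional.
function applyDiscountBefore(kind: string, total: number): number {
  if (kind === "student") {
    return total * 0.8;
  } else if (kind === "senior") {
    return total * 0.7;
  } else {
    return total;
  }
}

// After: each rule is a named strategy. Adding a rule means adding one entry
// to the map instead of touching existing branching logic.
type DiscountStrategy = (total: number) => number;

const discountStrategies: Record<string, DiscountStrategy> = {
  student: (total) => total * 0.8,
  senior: (total) => total * 0.7,
};

function applyDiscount(kind: string, total: number): number {
  const strategy = discountStrategies[kind];
  return strategy ? strategy(total) : total; // no matching rule: no discount
}
```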

In the next step, I revisit Gemini’s idea, make changes if needed, and then ask it to execute step 1 of the plan. I then work alongside Gemini on step 1, again over a few rounds of prompts, until I reach the actual solution. Then I come back to step 2 of the original plan, and so on.

What’s important: quite often I execute those steps in separate chat tabs and then get back to the tab where the idea and the plan were discussed.

That is my approach. And even though it might sound complicated and time-consuming, it actually isn’t, since I dictate all my prompts in my native Polish using MacWhisper and an ElevenLabs voice model.

The general outcome of my approach is that I’m always the driver; I never let the AI do everything for me from a general description. That was my approach in the past, but I got a lot of negative feedback on my code from my peers, and I’ve since completely changed how I work: I’m more mindful while using AI, I read and understand “most” of the code, and, more importantly, I shape the architecture of the solution with the AI rather than letting the AI design the solution itself.

1 Like

Honestly, I feel like all these “latest big models” are kinda overhyped. Most of them either dump too much info or wander off track. Claude 4? Feels rushed, like it wants to answer fast but doesn’t really give you the best answer. GPT-5? Shockingly, I actually find it weaker than 4.1, which was more balanced and thoughtful. And then there’s Gemini 2.5… low-key, I think it’s the best one out right now. It just feels more consistent, less chaotic, and actually useful throughout.

1 Like

I think we get different experiences depending on what we’re doing. Sometimes one model is way better at fixing one of my issues than another. It’s like talking to different people: some people know one subject better than another even with the same type of training; it’s just how they are. So the models are the same. Find what works for you, use it, and get used to it. Unfortunately, we have to do this per project, and it takes a lot of time.

I do a lot of testing, and the results are just vastly different. They’re not consistent, so it’s very dependent on the end user, the codebase, and the conceptual idea of the project itself.

1 Like

Claude 3.7 over Claude 4; Grok 4 beats Gemini; the GPT-5 models are all very good.

They all are only as good as your prompts and rules.

Anthropic has some catching up to do.

That’s exactly what I’m doing. Gemini 2.5 Pro is more consistent at providing the exact output I want, with little to no deviation, and its memory retention and callbacks are just superb. My experience with GPT-5 in core development is really bad: it’s good at laying out ideas, but implementation from start to finish is poor. It gets distracted from the main play and pays more attention to side quests.

Cool to hear that there’s at least one Gemini user here :joy: I feel like that model is very underrated by the community, and the GPT and Anthropic models are hyped too much.

I’ve been using Claude Sonnet 4 mainly for coding tasks and Claude Opus 4.1 for system design. From my experience, GPT-5 clearly outperforms Sonnet 4 and is on par with Opus 4.1. The issue is that its integration with Cursor still feels immature or possibly buggy, which prevents GPT-5 from reaching more than about 60–70% of its real potential.