Same for me. It takes FOREVER to take any action, probably thinking about how the universe came to be, and when it finally decides to act, the change is minimal and clearly unrelated to the actual problem I stated in the prompt.
Agree with the others. I ran a CodeRabbit code review that found 5 issues across 10 files Claude had created. I had GPT-5 do the updates. Super slow. And the follow-up code review found 19 errors. Reverted, used Claude, no errors.
Idk what you guys do, but I’ve had a flawless experience with GPT-5 in Cursor. It’s kinda slow, but it does what I need, checks for errors, and fixes them by itself, with nice code quality.
However, GPT-5 is not for UI, only UX.
This could be because the pattern of the code was familiar to Claude but not to GPT-5, which makes it resist the existing structure and try to impose patterns it is more familiar with.
ALSO, HOW COME IT USES THE CHEAPEST, STUPIDEST VERSION UNLESS YOU EXPLICITLY PICK HIGH!
So I tried again with GPT-5 (since it’s free right now).
It tried to modify my system files in Ubuntu without explaining why.
It spends a lot of time thinking but then it doesn’t really explain what it’s going to do.
It’s like it has no context for the workspace, so it just randomly tries things.
It’s a good example of a new model that does better on benchmarks but not in the real world.
Cross-posting this as I think it is very important: GPT-5 Main Discussion Thread - #88 by RafeSacks
There is a serious problem with Cursor’s GPT-5 integration, or with the reasoning version of GPT-5:
The task: creating a wrapper around the TipTap rich text editor component in React + TypeScript.
In Cursor:
- GPT-5 high MAX
- 3 × 1.5-minute thinking sessions to make a minor change to a file, and it still made a simple but critical error: it failed to inherit something, which broke the entire implementation.
In T3 chat:
- GPT-5 (non-reasoning)
- I pasted the whole file and the same message
- instantly generated nice output with headers and an explanation of the problem and a complete updated code file that worked fine.
…I thought I was talking to Claude-4 until I checked the model I had set in T3 chat.
…I then switched to GPT-5 low reasoning and it is much faster with no obvious difference in experience or code quality so far (though token stats look about the same as high reasoning in the usage dashboard, and high reasoning took minutes versus seconds)
IMPORTANT: I repeated this workflow two more times, asking GPT-5 in T3 chat to fix bugs created by GPT-5 in Cursor. Amazing output and perfect fixes from T3 chat; the same bad results from Cursor. Cursor has the codebase, tests, and rules; T3 chat has nothing but what I copy and paste. How can it be so much faster, with Claude-4-level conversational output??
What was the state of your codebase when you started using it? Brand new? Lots of existing code? Complex vs simple architecture? Clean?
Fairly large for a new MVP Next.js app, with a lot of custom sections and hooks and integration with a separate backend. Well-developed .cursorrules/ (dir style), etc. Claude seems to thrive.
That said, I kept the explicit context minimal in this chat, since I was working on a custom component that doesn’t really use the rest of the codebase at all.
I find GPT-5 really doesn’t understand pasted code ranges; it seems to be working from a stale cache a lot of the time. It also doesn’t seem to read images when I add them (T3’s GPT-5 has no issues reading images, from what I can tell by the output, anecdotally).
I couldn’t agree more
ChatGPT 5 is the best model for me so far.
The big problem now is that Cursor is using a beta model for Auto mode … and it seems to be the only option! They’ve completely ruined my day. Why would they make a model with so much negative feedback the default (and only) choice for Auto mode? I used to be happy with Auto, but now I feel like a lab rat.
I am turning it off; it doesn’t follow rules, and so far the quality of the code is poor. An example of what it got wrong, in its own words:
What I Did Wrong

- Misinterpreted “Please suggest changes” - When you asked me to “Please suggest changes”, I took this as permission to implement the changes directly rather than just providing suggestions.
- Bypassed the workflow - According to your rules, I should have:
  - Created a task file first
  - Waited for your explicit approval of the task file
  - Shown you the exact code changes I planned to make
  - Waited for your approval of those changes
  - Only then proceeded with implementation
- Made assumptions - I assumed that since you provided the exception details and asked for suggestions, you wanted me to implement the solution immediately.
I hope this terrible implementation we’ve experienced doesn’t last through the entire free GPT-5 period. It would be unfortunate if they fixed the implementation and then the period ended the next day. I hope they will extend it.
I have to run it with BMAD, and then it actually is great. Otherwise, don’t bother with just the default Cursor rule set, IMO.
I’ve had the opposite experience. GPT-5 is a little better than o3, which already works amazingly for me. I can’t even touch Claude; it takes charge and starts nuking my code left and right.
I agree, I think GPT-5 is working out pretty well. Claude was my go-to, now I am preferring GPT-5-high.
Mmm, you know the saying “Misery loves company”? That’s how I feel right about now. This rollout seems slightly improved since day zero, but man… you know something is off when you have to put a rule at the top of an ALWAYS file that says “Never run a hard reset on a git repository - this is a manual process you are not entitled to, ever!”.
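If anyone wants the same guardrail, here is a rough sketch of what that always-applied rule can look like (I’m assuming Cursor’s `.cursor/rules/*.mdc` project-rule format here; the filename and wording are my own):

```
---
description: Git safety guardrails
alwaysApply: true
---

- Never run a hard reset (`git reset --hard`) on this repository. This is a manual process you are not entitled to, ever!
- Treat any destructive git command (reset, clean, force push) as requiring my explicit confirmation first.
```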
Agent mode has its perks, but I have never encountered an agent to date that, when given the simple statement “Remove x/y/z throughout the file as these are unnecessary”, decides a full-on hard reset of the repository is a valid option and completely nukes it. It honestly believes this is reasonable, then gets wildly apologetic… immediately after doing it. “Oh no, I am so sorry… yada yada.” Don’t get me wrong, it’s pretty funny, but I could certainly see it being devastating to anyone not obsessing over backups and committing code after every joint task.
Granted, I am not sure whether the past couple of updates have resolved this behavior, or whether the rule is sticky enough that it is no longer being ignored. Other than that, it’s a mixed bag here. Some results are overly complex for no reason, and others are completely unfinished. It seems to take more brain power to correct it and tie the loose ends together than to just do the work yourself.
I guess the worse this stays, the longer I have a job, so… Keep at it!
That is how I am feeling. It has degraded to the point where I can’t get it to do the simplest thing: create a shell tool that makes a git commit with a proper message. I had this flow working for weeks with every model. It literally can’t figure it out. It generates printf commands; it slams the multiline message into a single string. It has tried everything except what I asked it to do in an .mdc file with step-by-step instructions and expectations.
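For reference, keeping a multiline commit message intact is trivial if the message is fed through stdin instead of mangled into printf. A minimal sketch (the commit text here is made up):

```sh
# Read the full multiline message from stdin so nothing gets flattened:
git commit -F - <<'EOF'
feat: add commit helper

- keep the multiline body intact
- no printf gymnastics required
EOF
```

(Repeated -m flags work too: `git commit -m "subject" -m "body"` turns each one into its own paragraph.)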
I suspect all of the issues we see are a combination of tool-use failures and bad prompting. Go try GPT-5 at T3 chat and you can see the model isn’t the issue (or at least I don’t know how much of the issue is the model).
Update: I had k2 set in a custom mode to run this flow, but it appears to ignore the model setting now, and there are no edit options. I just switched to k2 and it did the work in seconds, flawlessly. I think this proves how important tool use is to what we perceive as intelligence.
Same here, we find it great for large codebases, refactoring, etc.
Wish we could disable GPT-5 from being used in our Auto requests. Its “thinking” takes something like 10x as long as anything else I’ve noticed in Auto, even with relatively simple requests like “explain this API”. Auto is my primary method, and this greatly reduces its usefulness. I am not sure why it’s so slow.
Update: GPT-5 is a lot slower, but it does produce better results so far. Things that are really easy take way too long, but I think that’s because Auto is using GPT-5 thinking, and I am not used to thinking models, so this is probably normal.
I completely agree! The switch to GPT-5 made Cursor WAY WORSE!
There is no explanation of what’s happening or why changes are being made. Everything is “buried” deep in the “thought” process, which is basically useless and unreadable.
We can’t steer the model anymore; we just have to hope the GPT-5 “black box” gets it right, which it sometimes (rarely) does…