I would like to thank Cursor for the exceptional Auto mode performance.
The quality is now indistinguishable from Claude-4, making me forget other models exist.
I exclusively use Auto mode. Working with Flutter, despite its complexity, has become seamless.
I have also found it to be exceptional, until today. All day, every response has taken a minimum of 10 seconds and upwards of 30-50 seconds. And the quality of the responses has been garbage! I'd sure like it to go back to performing the way it did every day until now!
I wanted to offer a different perspective on the waiting times you've been experiencing with Auto mode. From my observations, I'm starting to believe the issue might be connected to exceeding a certain command limit; specifically, I suspect it's tied to hitting a 500-command threshold. It seems plausible that once this limit is reached, there's a noticeable and consistent drop in performance, which would explain the longer response times you've described.
It might be worth investigating if your usage patterns align with this theory, as it could help pinpoint the cause of the recent performance degradation.
I'm still pretty new to Cursor. What do you consider a "command"? 500 over what time frame? Today I've only been trying to chat with Ask/Auto to set up a few rules for a brand-new project. Not even a single code file yet.
With GPT-5 now available at a great price, I think they moved from GPT-4.1 to GPT-5. The output costs a little more, but the lower input cost makes a big difference for them, which is why it's so much better.
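To put rough numbers on that reasoning (the prices and token counts below are my own assumptions based on public list pricing, not anything Cursor has confirmed): agent requests send far more tokens in than they get back, so the input rate dominates the bill.

```python
# Rough cost comparison for an input-heavy agent request.
# ASSUMPTIONS (mine, not Cursor's): OpenAI list prices at the time of
# writing -- GPT-4.1 at $2.00/M input and $8.00/M output tokens,
# GPT-5 at $1.25/M input and $10.00/M output -- and a hypothetical
# agent turn with 100k tokens of context in and 2k tokens out.

def request_cost(in_tokens: int, out_tokens: int,
                 in_price: float, out_price: float) -> float:
    """Dollar cost of one request, with prices given per million tokens."""
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

gpt41 = request_cost(100_000, 2_000, 2.00, 8.00)   # $0.216
gpt5 = request_cost(100_000, 2_000, 1.25, 10.00)   # $0.145

print(f"GPT-4.1: ${gpt41:.3f} vs GPT-5: ${gpt5:.3f}")
```

Under those assumptions the GPT-5 turn comes out about a third cheaper, despite the higher output rate, purely because the input side is so much larger.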
In your screenshot, to the right of the Agent dropdown there is another selector, currently set to o3. If you click that second selector, it should give you a toggle to turn on Auto.
I am actually quite a fan of NON-thinking models! I used Claude 4 Sonnet Thinking for the first few months. The whole "thought process" stuff seemed really cool.
Then I learned that thinking models cost twice as many requests. They also seemed to go off and do their own thing more than non-thinking models did. (ESPECIALLY Gemini at the time… man, that thing is a hyper-opinionated bulldozer, and it doesn't like Sonnet code!)
After working with Cursor for a while I learned the value of planning, so I would first ask the agent to generate a plan, report back to me, and wait for further instruction. I'd refine the plan a bit before finally telling the agent to enact it (or, more often, enact it one phase at a time; I now even have rules about how plans should be created in a multi-phase fashion, as sketched below).
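For anyone curious, here's a minimal sketch of the kind of planning rule I mean (the filename and wording are illustrative, not my exact setup):

```
# .cursor/rules/planning.mdc (illustrative sketch)
When asked for any non-trivial change:
1. First produce a written plan split into small, numbered phases; make no edits yet.
2. Report the plan back and wait for explicit approval.
3. On approval, enact ONE phase at a time, reporting back after each phase.
```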
I realized that by planning AND using a thinking model, the two were sometimes at odds, AND I was RIPPING through requests at an insane rate. So I switched from primarily using Sonnet Thinking to Sonnet non-thinking. That gave me a lot more requests, and I honestly found the results more acceptable, not to mention faster (a non-thinking model just does; it doesn't waste time pretending to cogitate while burning tokens needlessly).
I've been using gpt-5-fast today, and I am sitting and waiting a LOT more while the thinking does its thing and wastes my time (while hoping it really truly is FREE right now!), when I figure Sonnet would have been done already. The model itself is pretty fast, so it moves more quickly than Sonnet once it actually acts, but sometimes the "thinking" process can roll on for a good while first, and when I already have a plan of action I want to follow, the thinking is really quite annoying and seems quite useless…
If they can offer a thinking mode for free, that would be HUGELY HELPFUL. Have options for "thinking Auto" and "non-thinking Auto". We shouldn't be forced to pay for a thinking model given GPT-OSS and QWEN!
I also found the Auto model way more useful compared to even my recent experience. I'd like to stick with it as much as possible, since it's included in the plan, and with the new pricing I reach the limits very quickly otherwise.
My biggest concern is performance. Sometimes 100 lines of code take 5 minutes to generate. I also experience hung requests, which force me to stop the current one and prompt again.
Kudos to the team and I hope they address the performance issues soon!