GPT-4.1 In Auto Mode

I’ve been running Cursor Auto and I’ve got a rule that makes the model clearly state its name at the start of each answer. So right now, are you guys actually giving us GPT-4.1 in Auto? It used to be Sonnet or at least Haiku (or GPT-5), but now it’s just this junk. Honestly, that says a lot about how much Auto mode performance has dropped lately.
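
The rule itself is nothing fancy, roughly like this, as a project rule file under .cursor/rules with alwaysApply set (the exact wording here is just an example):

```
---
description: Always announce which model is answering
alwaysApply: true
---

At the very start of every answer, state your model name and version
on its own line, then continue with the actual response.
```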


agreed++

Hey, such rules aren’t accurate; models often hallucinate their own identity. For example, DeepSeek will sometimes claim it’s from OpenAI, etc.

Auto is just a router that gives you one of the premium models, depending on what’s likely to be best for the current task, etc.
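
Conceptually that just means something like this (a made-up toy sketch, not Cursor’s actual code; the model names and the “what’s best” heuristic are placeholders):

```ts
// Toy illustration of a task-based router -- not Cursor's real code.
// Model names and the heuristic are made-up placeholders.
type Model = "claude-sonnet-4.5" | "claude-haiku" | "gpt-4.1";

// The caller never picks a model; the router guesses a good fit for the task.
function route(prompt: string): Model {
  const heavyTask = prompt.length > 2000 || /refactor|architecture/i.test(prompt);
  return heavyTask ? "claude-sonnet-4.5" : "gpt-4.1";
}

console.log(route("write a commit message for these changes")); // "gpt-4.1"
```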


GPT-4.1 is one of the best models if you know how to write prompts and provide context well. I’ve been using it as my daily driver since it came out.


True, GPT-4.1 is pretty great, and it’s built to handle context really well.
Given that the task here is generating a commit message, it could actually do very well.

When I ask:

I have been using Auto for a day or so (I’m getting close to my monthly usage limit), and it’s performing pretty well so far.

Nah, that’s not true. I’ve got the same rule and it works correctly about 90% of the time. The only exceptions I’ve seen are Sonnet 4.5 occasionally claiming it’s 3.5, or Haiku sometimes saying it’s Sonnet. But OpenAI models always report their names correctly, so yes, it’s definitely GPT-4.1.

Also, Auto doesn’t actually pick a model based on the task; it just gives you one generic model from the pool for everything. Sure, GPT-4.1 can handle stuff like commit message generation no problem, but for a lot of other use cases it’s just trash. The model is the same no matter what you’re doing, and that’s really the point of the original post.

I assume you’re using Auto to decrease or completely offset the costs.
Does it really decrease the cost, or would you achieve the same by manually selecting GPT-4.1, for example?

It’s Auto mode. Auto is not a “model”, it is a MODE. It is purposely designed to switch between models according to the load on each model. If Sonnet is overloaded, it will switch to another model. As far as I have gathered from the docs, it can use ANY model. There are zero guarantees about which model you’ll get when using Auto, but that is the entire point… it’s designed to balance load.
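
To picture what that load balancing means (purely a conceptual sketch of my reading of the docs, with made-up model names and load numbers, not Cursor’s real implementation):

```ts
// Conceptual sketch of load-based fallback, with made-up load figures.
const load: Record<string, number> = {
  "claude-sonnet-4.5": 0.95, // overloaded
  "claude-haiku": 0.40,
  "gpt-4.1": 0.20,
};

// If the preferred model is saturated, fall back to the least-loaded one.
function pick(preferred: string): string {
  if (load[preferred] < 0.9) return preferred;
  return Object.keys(load).reduce((a, b) => (load[a] <= load[b] ? a : b));
}

console.log(pick("claude-sonnet-4.5")); // "gpt-4.1"
```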

Since Auto is no longer free… there is really little point in using it if you prefer a particular model. You will be billed the same when using Auto as with any other model.

The SOLE exception is if you were grandfathered in with an existing pre-paid plan. However, even in that case, Auto is a MODE, and its express purpose is to allow Cursor to use different models behind the scenes to distribute load.

This is expected and documented behavior. :person_shrugging:

I doubt you will get the true models used when in Auto mode, for several reasons. One reason is that the cheapest old models used in Auto are designed, if I can exaggerate a bit, to agree with everything you say. If you tell it it’s GPT-4.1, it will eventually agree; if you confront it and say no, you’re Claude, it will agree with that too. Annoying, but that’s the way it’s designed.

Another reason, and this has been covered A LOT on these forums, is that you have to ask WHY Cursor chooses NOT to be transparent about exactly which model is used in Auto mode. Some claim it’s naive to think Auto mode is choosing the best premium model for you, and that it’s actually a MUCH cheaper model handling most, if not all, of the replies.

Lately I’ve noticed a shift in how the Auto model/models operate. The Auto model used to reply “You are absolutely right” when asked a question, again agreeing rather than evaluating the question. That stopped a week or two back.

Now, however, it’s almost impossible to get Auto to do anything right. If I ask it to add an item at the top, it adds it at the bottom. If I ask it to move an item to the right, it moves it to the left. If I ask it to fix an issue, it adds another fallback system instead. It’s bordering on completely useless.

My thought is that the old Auto mode was a single old, cheaper GPT model and now it’s Cursor’s own model. That would make the most sense from a business standpoint.