Auto provides inconsistent Agent behavior

I feel there is an inherent problem with Auto agent mode. Cursor appears to use some algorithm to decide which LLM gets used, and beyond the differences in model capabilities, the models seem to be managed in ways that produce noticeably different user interactions. Yesterday, agent responses suddenly started presenting a thinking process. The LLM seemed unsure and confused by tasks that had simply been churned out the day before with no thinking shown. This diminishes trust and causes frustration. At a minimum, Auto mode should maintain a single, stable contract of interaction with users.

Now, suppose I accept Auto mode as a cost optimization that works for both the provider and the developer. I still want to know which LLM delivered the code assist. Hiding this is another trust-diminishing aspect, and it also shields the provider from important developer feedback: the Cursor team should want to know which models frustrate developers with subpar responses. I hope the Cursor team cares about trust and frustration, and I would really like to see concrete steps to rectify this situation. Consistency of response quality is paramount.

I used to manage a large team of developers, and the current situation with Cursor Auto is tantamount to receiving anonymous PRs. That would have been unacceptable when managing teams of developers, and the same should apply to Cursor. Maybe I can adjust and simply stop using Auto mode, but my understanding is that it is an important cost-saving feature. I'd be glad to get some feedback on this from the community and the Cursor team.

I agree with you. Different models work best with different prompting styles, write code differently, and so on. Using Auto is rolling the dice, and I don't know why I'd ever want to do that.

I used to use Auto for questions only, but even then the quality of the response can vary depending on the model.
