I have really been enjoying working in agent mode the last few days. Unfortunately, it still makes many mistakes and can’t take on tasks that are too involved (at least in the way I am using it).
I’m hoping that adding reasoning models to agent mode will help improve this situation, but only Anthropic models and GPT-4o (not sure about other GPT models) are supported right now.
When will we have access to reasoning models in agent mode such as DeepSeek R1, the o1 series and especially the o3 series?
Hey, unfortunately, these reasoning models don’t support calling the tools that the agent relies on, so this is unlikely to work until the models gain tool-calling support.
As a workaround, you could use a normal Composer to generate the changes you want to make, then copy them into the agent for it to implement. We are also looking at using o1 in the agent by giving it a “planning tool” that calls o1 to do exactly what I’ve described, but we are still working on this internally and don’t yet have an ETA.
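If it helps to picture the pattern, here’s a rough sketch of that idea: the agent model (which supports tool calling) is given a “plan” tool whose implementation simply calls a reasoning model with plain text, since the reasoning model can’t call tools itself. The model names, tool schema, and function names below are illustrative assumptions, not Cursor’s actual internals.

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical "plan" tool exposed to the agent model.
PLAN_TOOL = {
    "type": "function",
    "function": {
        "name": "plan",
        "description": "Ask a reasoning model for a step-by-step implementation plan.",
        "parameters": {
            "type": "object",
            "properties": {
                "task": {"type": "string", "description": "The task to plan for."}
            },
            "required": ["task"],
        },
    },
}

def run_plan_tool(task: str) -> str:
    # The reasoning model gets plain text only -- no tools -- because it
    # cannot call tools itself; it just returns a plan as text.
    response = client.chat.completions.create(
        model="o1",  # assumed reasoning model
        messages=[{"role": "user",
                   "content": f"Write a concise implementation plan for:\n{task}"}],
    )
    return response.choices[0].message.content

def agent_step(user_request: str) -> str:
    # The agent model decides whether to delegate planning to the tool.
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed agent model with tool-calling support
        messages=[{"role": "user", "content": user_request}],
        tools=[PLAN_TOOL],
    )
    message = response.choices[0].message
    if message.tool_calls:
        args = json.loads(message.tool_calls[0].function.arguments)
        return run_plan_tool(args["task"])
    return message.content
```

In a real agent loop, the plan text would be fed back to the agent model as a tool result so it can act on it; this sketch just shows the delegation step.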
Hey, we wanted to provide users with v3 quickly, as it was highly requested, but we still need to do more work and testing to ensure it would work well with agents!
I’ll throw this to the team, and we may get this working in a future update.
Great to hear! People (myself included) seem to be having some strange issues with Cursor agent mode and o3-mini, like it not applying its changes to the files.
Yeah, I’m having the same experience: sometimes it stops in the middle, other times it says it’s going to implement something and then stops. It doesn’t seem to be working as expected right now.
For me, the best model in agent mode so far is Sonnet/Haiku. It implements everything the way I ask, runs tests, fixes lints, runs all the commands, and only finishes when everything is done!
I agree; I still use Claude + DeepSeek R1 in Cursor.
o3-mini seems very good at some of its “reasoning,” but its implementation in Cursor seems unusable: it fails to apply changes, loses direction, confuses files, etc. Once you get it to the point where it understands it must analyze something, or it proposes a certain refactoring, it’s impossible to get it out of the “I will proceed to apply it” loop; it’s as if you have to insist dozens of times.