Apparently the team behind Aider has found out that combining R1’s architect mode + using claude as the coder/actor significantly outperformed O1 during the polyglot benchmarks, compared to just them being standalone.
Would help to have a native way to have this two mode be seamlessly integrated within one single prompt. and give us the ability to select which model to use for each mode for that SINGLE stream prompt.
Github source
Interestingly it wraps Deepseek response in <thinking></thinking> tags before giving it to Claude, I see no other valuable insights in the code.
It would be nice to know if it is on the Cursor roadmap and we can expect it in future versions.
I think that even on the price side there are new alternatives that could make it more competitive to implement r1:
In relation to Fireworks.ai it is much cheaper, although the context is smaller (164k vs 128k) and its throughput worse, but perhaps for a combined architect mode where sonnet is the one implementing the code it would be enough, just speculating.
I think Cursor should be a beta version. The newest features (architect etc) should be in this version and it should be completely experimental. Once the bugs are worked out, it can be added to the main release.
I too would very much like this (had similar positive experience with o1/o3 Mini followed by Claude). In fact, going further, what I would really like is the ability to, within Cursor:
Give a prompt simultaneously to N models: E.g., o1/o3 Mini, and R1, (and maybe Claude too), asking for detailed discussion of issues associated with implementing ABC, fixing apparent bug DEF, etc.
Give ALL outputs of above to (likely) Claude, with directive to read through all responses, assess and integrate each, note any observed inconsistencies, ask questions where necessary, etc. — and then generate code.
Might even want to then give generated code back to N “higher level” models in one go, asking each for their assessment of said code.