Congrats on the 3.7 improvements, keep it going please

Great product! Keep up the excellent work!

Before the last update I had switched to Claude Code, but with this one (0.47) I can see what you did there :wink:. You're making significant progress now. Since I use both, I can spot the differences pretty quickly. You should consider introducing optional tiers with adjustable costs based on the number of requests, if that would produce a better-quality response. I assume the last change you made was in the prompt itself? Here's a suggestion:

  1. Tier 1 - 1x request (lower quality)
  2. Tier 2 - 2x requests (current level, which seems comparable to Claude)
  3. Tier Yolo - unlimited requests at a premium cost (just as Claude Code does; quite expensive but useful sometimes)

The /cost command in Claude Code is a brilliant innovation! It could well become a standard feature in future agents by providing flexible cost limits for each request. Adding a similar capability would let users track their costs live, giving them the flexibility to adjust based on their preferences and budget. I hope you're already working on this. Keep pushing the boundaries and enhancing your offerings!
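The idea behind a /cost-style live tracker is simple: accumulate token usage per request and multiply by per-token prices. Here's a minimal sketch in Python; the model name and prices are hypothetical placeholders, not real rates.

```python
# Minimal sketch of a /cost-style running cost tracker.
# The model id and prices are hypothetical placeholders, not real rates.
PRICES_PER_1M = {"model-a": {"input": 3.00, "output": 15.00}}  # USD per 1M tokens


class CostTracker:
    def __init__(self, model):
        self.model = model
        self.input_tokens = 0
        self.output_tokens = 0

    def record(self, input_tokens, output_tokens):
        # Accumulate usage after each request completes.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    def cost(self):
        # Convert accumulated tokens to dollars.
        p = PRICES_PER_1M[self.model]
        return (self.input_tokens * p["input"]
                + self.output_tokens * p["output"]) / 1_000_000


tracker = CostTracker("model-a")
tracker.record(input_tokens=2_000, output_tokens=500)
print(f"${tracker.cost():.4f}")  # 2000*3.00 + 500*15.00 = 13500, so $0.0135
```

A real implementation would read the usage numbers from each API response instead of passing them in by hand, but the running-total idea is the same.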

P.S.: My tests were all done in Rails, vanilla JS, Hotwire, and the rest of the Ruby/Rails stack, along with bash tooling.

Why would someone want lower quality code? What am I not getting here…?

Different AI models, like Gemini and Claude, can produce varying results depending on the task. Claude 3.7 might be performing better because of improved prompts or more processing time, though I'm not entirely sure. Doubling Claude's request size seems to have boosted its performance in this version. When I use Claude Code, it sometimes takes over 120 seconds to respond or think, which I haven't seen with Cursor. This makes me think Cursor could also improve by increasing its power or request size, or something? Idk…

Lower quality would be something like 1x request: fast thinking, fast response…

Why would someone want lower quality code?

No one would.
I think labeling it “lower quality” isn’t accurate.

What people would want is the same quality code/solution using 1 request.
Not every thinking task requires 2 requests.

The token budget is easily controllable via the API, so having the choice is ideal.
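For instance, Anthropic's Messages API exposes an extended-thinking `budget_tokens` parameter, so mapping tiers to budgets is a few lines of code. A minimal sketch follows; the tier names, budget values, and model id are my own assumptions for illustration, not anything Cursor actually ships.

```python
# Sketch: map the suggested tiers to a per-request thinking budget.
# Tier names, budget values, and the model id are illustrative assumptions.
TIER_BUDGETS = {
    "tier-1": 1_024,   # 1x request: small thinking budget, fast response
    "tier-2": 2_048,   # 2x requests: roughly double the budget
    "yolo": 32_000,    # premium: large budget at a higher cost
}


def request_params(tier, prompt):
    # Build a request body; the "thinking" field follows the shape of
    # Anthropic's extended-thinking parameter.
    return {
        "model": "claude-3-7-sonnet",  # placeholder model id
        "max_tokens": 64_000,
        "thinking": {"type": "enabled", "budget_tokens": TIER_BUDGETS[tier]},
        "messages": [{"role": "user", "content": prompt}],
    }


params = request_params("tier-2", "Refactor this Rails controller.")
print(params["thinking"]["budget_tokens"])  # 2048
```

The point is that the same model with a smaller budget stays more predictable than switching models entirely, which is the trade-off discussed above.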

It might seem like it complicates things, but users will already have to decide whether they want to use 2x requests or switch to an entirely different model (which will have less predictable results than the same model with a lower thinking budget).

I think they just released what I was talking about, the MAX mode … LOL !