Is This an Agent Problem or a Model Problem?

After two weeks of frustratingly poor results with Cursor (around a 10% success rate), I decided to subscribe to Windsurf to see if it would perform any better. But the outcome left me even more confused.

With Cursor, I tried many models, even downgrading to older versions, but the performance was consistently worse than the excellent results I was getting a few months ago. I was hoping Windsurf would be different, but after testing multiple models there too, the behavior was eerily similar: very poor results and almost identical mistakes.

This got me thinking: is the issue with the agents themselves, or have the models actually gotten worse recently? Another theory: maybe these code editors are quietly reducing the context window to save on costs.

The only thing that consistently saved me when both Cursor and Windsurf failed was Google AI Studio with the new Gemini 2.5 Pro model. It not only identified the bugs that both Cursor and Windsurf had introduced but also fixed them with impressive accuracy.

So now I’m wondering: is this a limitation of the agents? Are the models being throttled? Will Gemini 2.5 Pro save these programs?

Curious to hear what others are experiencing.
