Are there published benchmarks for Cursor Agent mode?

benthamite · January 5, 2025, 7:26pm

I have been using Agent mode and finding it helpful. Are there any published benchmarks of its performance, e.g. on SWE-bench Verified or AgentBench?

danperks · January 7, 2025, 5:53pm

Not currently. Composer, and Cursor as a whole, are intended more as an interactive way of coding than as a one-shot LLM that can be compared directly with other AI models!

It will be interesting to see how Claude stacks up against itself when used within Cursor’s Agent Mode though!
