Best LLM Benchmarks for code?

DanDan · December 19, 2024, 12:33pm

Hi!

I have been struggling to find good benchmarks for comparing LLMs on coding tasks. We now have ~10 models to choose from, all with pros and cons. Does anyone know of a reliable benchmark that is up to date (meaning it covers the new o1 etc., maybe even Google Gemini 2)?

deanrie · December 19, 2024, 5:43pm

I only know SWE-bench, which is focused on coding, but it’s specifically about tools, not models. There’s also LMArena, which has a coding category where you can see a leaderboard of models.

https://lmarena.ai/
