It is an open-source model. I hope the team uses it.
As usual, the benchmarks won’t tell the whole story, and it’s unlikely to generalize to coding nearly as well as R1.
A dedicated OpenRouter API slot that doesn’t require disabling every other model and bypassing our subscriptions would cover 99% of these requests and let us experiment with new models.
BTW, you can try it below:
QwQ benchmarks from Paul Gauthier, which tend to be much more reliable for judging real-world coding performance than LiveCodeBench and similar benchmarks.