- Have you used GLM 5.1 or Composer 2 in your workflow?
- Can they really match the quality of Opus 4.6 or GPT-5.4?
- Which model performs best for complex coding tasks in your experience?
- Which one is more accurate for deep analysis and reasoning?
- Are cheaper models now “good enough” to replace the top ones?
- Has pricing made you switch from Opus/GPT to GLM or Composer?
- Do you rely on a single model or switch between models per task?
- What is your main go-to model right now and why?
- Do you think the gap between models is closing, or still significant?
Please share your real-world experience.
In my experience, Opus and the latest GPT versions are still significantly more capable than Composer or GLM at reasoning and handling complexity.
For simple tasks, like writing small utility functions or basic scripts, Composer and GLM can perform at a similar level. But once the task becomes more complex and requires deeper reasoning, multi-step logic, or strong contextual understanding, they tend to fall short. The responses are often less reliable and not as well-structured.
Another important difference is evaluation and benchmarking. Models like Opus and GPT are consistently tested on rigorous benchmarks such as HLE, ARC-AGI-2, and SWE-bench Pro, which reflect real-world reasoning and coding challenges. Composer and GLM don’t appear to be evaluated as transparently on these standards, which makes it harder to trust their performance at the same level.
Overall, I still rely on top-tier models for complex work because they are more accurate, consistent, and dependable. Lower-cost models are improving and can handle simpler workflows well, but they’re not yet a full replacement when high-quality reasoning and precision are required.
I follow a simple principle: use the most intelligent models for designing and planning, and switch to more cost-effective ones for building and implementation.
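As a rough sketch of that principle, here is how the routing could look in code. The phase names, the `pick_model` helper, and the model labels are placeholders I made up for illustration; they are not real API model identifiers, so swap in whatever your provider actually uses.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical phases of a piece of work."""
    DESIGN = "design"
    PLANNING = "planning"
    IMPLEMENTATION = "implementation"


# Placeholder labels based on the models named in this thread,
# NOT real API model identifiers.
TOP_TIER_MODEL = "opus-4.6"
BUDGET_MODEL = "glm-5.1"


def pick_model(phase: Phase) -> str:
    """Return the model label to use for a given phase of work."""
    # Design and planning go to the most capable model.
    if phase in (Phase.DESIGN, Phase.PLANNING):
        return TOP_TIER_MODEL
    # Building and implementation go to the cheaper model.
    return BUDGET_MODEL


if __name__ == "__main__":
    for phase in Phase:
        print(f"{phase.value}: {pick_model(phase)}")
```

In practice the split is rarely this clean. If an implementation step turns out to need multi-step reasoning, I would move it back to the top-tier model.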
Thank you. What other models do you use?