When Claude 3.5 first came out, it was incredibly efficient. It saved me a lot of time, especially across the many projects I worked on through the editor. However, as the release of 3.7 approached, I started noticing a decline in 3.5's performance. It wasn't just a vague impression: when I gave the model the same tasks it had previously handled well, it started making mistakes or producing incomplete results. Even simple tasks became a problem.
When 3.7 was released, most of these issues seemed to be resolved. Initially, it worked very well and was stable. But after a while, I started to notice a similar pattern. The way the model approached certain tasks changed, and the error rate increased. Sometimes it would invent assumptions on its own, treating things that didn't exist as if they did, and building its responses on those false premises. Inevitably, the model felt as though it had been weakened.
Of course, models get updated and changed over time. But discovering that these changes aren't always improvements is genuinely frustrating, especially for people who rely on these tools for their work. You depend on a model, build your workflow around it, and then, over time, it simply stops delivering the same performance.
The biggest difference between 3.5 and 3.7, in my opinion, is context management. I used 3.5 extensively, especially on projects with multiple files, and it quickly became an essential part of my workflow. Paired with Cursor in particular, it let me accomplish far more than before. However, as my projects grew more complex and the number of files increased, serious problems began to arise: the model would forget context, get confused, or generate irrelevant answers. In contrast, 3.7 handled these situations much more effectively. It struggled far less with complex tasks and context management, a significant improvement over 3.5 in these areas.
However, over time, even this more capable model started to degrade. Honestly, I don't understand why. Was it optimization, cost-cutting, or something else? It's unclear whether the issue lies with Claude, with Cursor, or with changes on both sides. But whatever the reason, it feels like disregard for users.
You release a product, people get used to it, trust it, and build their workflows around it. Then, over time, you deliberately degrade the old product's performance and release a new version, saying, "Here, use this now." Weakening the old product on purpose and marketing the new one as a slightly better alternative is, in my opinion, unethical. That isn't a sales strategy; it's manipulation of users. And such approaches will erode trust in the long run, because eventually people will start asking: how can we be sure the next version won't be intentionally weakened too?
I'm sure I'm not the only one experiencing this; many others are likely facing the same or similar problems.
At some point I may explain the problem in more detail, with concrete examples from my coding work, but I'm not sure I have the patience for it.
For now, I'll stick to a surface-level description of the issues I've faced.