Not an issue with Cursor, but today gpt-5-high seems substantially less intelligent than it did yesterday. Solutions are much dumber, tools are failing or being misapplied, instructions aren’t being followed… Has anyone noticed anything similar?
OpenAI seems to like this sort of bait-and-switch:
- Roll out a strong new model that outperforms the competition on benchmarks and in real-world use
- Allow people to test and review the new model, generating hype and good publicity
- Take market share from competition as people make decisions based on reviews, vibes, and benchmarks
- When the hype cycle dies down, quietly nerf the model via pruning, which hurts raw intelligence while preserving benchmark scores and makes the model smaller and cheaper to run inference on