I was wondering this because the benchmarks on long-context understanding show that gemini-2.5-pro might not have such a problem with long contexts, and I feel like Cursor doesn't differentiate between models when triggering that message. I bet it's a fixed threshold on the model context, but now that models are getting smarter, we might need different thresholds for different models.
I'm saying that based only on empirical evidence from continuing to use gemini-2.5-pro even after the long-chat message appears (give it a try) and seeing almost equally good results. Claude Sonnet is still bad at long contexts, though.
Since the context length of gemini-2.5-pro is larger than Claude's, of course it will handle longer contexts better; but the performance degradation after 200k tokens, even with a 1M context window, is very noticeable. It's worth starting a new chat.