I started using the latest Sonnet, and my requests keep timing out. Ditto for the older Sonnet. Availability seems really bursty, and it’s frustrating to have no idea whether my requests are going to work or not. I’ve adopted a very different style of work for each model, so the moment a model is suddenly unavailable, it feels like a “context switch” in my head and substantially impairs my productivity.
Would it make sense to put up a dashboard showing overall availability for the different models? For example, something like https://status.cloud.google.com/ that shows how close each model is to Cursor’s overall quota.
I understand that Cursor is probably constrained by Anthropic’s overall GPU capacity. But I wonder if it would be easier for Cursor to negotiate with them for preferential access to their overall pool if there were clearer pricing signals.
I’m willing to pay more if I can have more predictability around quota.