Can we have a feature that lets us cap the maximum usage of each token type (output, cache read, etc.)?
This would be useful for automated workflows, so I don’t have to worry about the agent going crazy and consuming a lot of tokens where I don’t expect it to.
As a workaround for now, I process the streamed output using a wrapper and abort the request once it exceeds a maximum number of characters.
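For anyone wanting to try the same workaround, here is a minimal sketch of such a wrapper. It assumes the agent runs as a subprocess that streams text to stdout; the names `cap_stream`, `run_capped`, and `OutputBudgetExceeded` are made up for illustration and are not part of any Cursor API. Note that character count is only a rough proxy for output tokens and says nothing about input or cache-read tokens.

```python
import subprocess
import sys
from typing import Callable, Iterable, Iterator, Optional, Sequence


class OutputBudgetExceeded(Exception):
    """Raised when the streamed output grows past the character budget."""


def cap_stream(
    chunks: Iterable[str],
    max_chars: int,
    on_abort: Optional[Callable[[], None]] = None,
) -> Iterator[str]:
    """Yield chunks until their combined length would exceed max_chars."""
    seen = 0
    for chunk in chunks:
        seen += len(chunk)
        if seen > max_chars:
            # Hypothetical hook: stop the producer (e.g. kill the process).
            if on_abort is not None:
                on_abort()
            raise OutputBudgetExceeded(
                f"output exceeded {max_chars} characters"
            )
        yield chunk


def run_capped(cmd: Sequence[str], max_chars: int) -> str:
    """Run cmd, returning its stdout, and kill it if it prints too much."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    collected = []
    try:
        # Iterating over proc.stdout yields the output line by line.
        for line in cap_stream(proc.stdout, max_chars, on_abort=proc.kill):
            collected.append(line)
    finally:
        proc.stdout.close()
        proc.wait()
    return "".join(collected)
```

Calling `run_capped([...your agent command...], max_chars=20_000)` either returns the full output or raises `OutputBudgetExceeded` after killing the process; the command itself and the budget value are placeholders to adapt to your own workflow.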
You’re probably already aware of these two features, but right now the two closest options are carefully setting your on-demand usage limits and keeping a close eye on your usage at https://cursor.com/dashboard/usage, which shows both your included usage and your on-demand API usage.
Another helpful tip is to configure Cursor to always show your usage limits: go to Cursor settings → Agents → Usage Summary and change it from the default Auto to Always.
I know it’s not exactly what you’re looking for, but hopefully this can help you keep track of your usage more closely.