Description
As a Pro subscriber, I’m experiencing significant issues with rate limit transparency that are severely impacting my development workflow. Despite reading all available documentation and contacting support, I cannot get specific answers about when limits reset or how to track usage against the “$20+ of model inference” allowance.
The core problem is that Claude 4 Sonnet is essential to my development process. In my experience, it's the only truly great model for coding. Gemini 2.5 Pro cannot use tools, which makes Claude 4 Sonnet the only viable option for my workflow.
Current Issues
Rate Limiting Without Clear Reset Timing
- Hit rate limits on July 3, 2025, at 11:19 AM EST after only 36 minutes of usage
- After switching to Claude 4 Opus, was rate limited again after just one prompt
- It’s been over 5 hours with no reset (documentation says “every few hours”)
Performance Degradation
- Claude 4 Sonnet requests have been extremely slow with very few tokens per minute
- I suspect this is also a form of rate limiting that isn’t clearly communicated
Dashboard Limitations
- The Cursor dashboard shows usage events but no dollar consumption amounts
- All requests show “Included in Pro” but I still hit limits
- No way to track actual usage against the “$20+” burst limit
- Cannot predict when I’ll hit limits again
Documentation Gaps
I’ve thoroughly reviewed the available documentation, which states:
- Pro users get “over $20” of model inference per month (burst limits)
- Local rate limits “refill fully every few hours”
- Rate limits depend on “model used, message length, attached file size, conversation length”
- “Sonnet” has higher limits than “Opus”
What’s missing:
- What does “over $20” translate to in actual Claude 4 Sonnet requests?
- What exactly does “every few hours” mean? (2 hours? 6 hours? 12 hours?)
- How much compute does each model actually consume per request?
- How can users track actual compute consumption against the $20+ limit?
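As a rough illustration of the first and last questions, the sketch below converts a dollar allowance into an approximate request count. Everything in it is an assumption made for illustration: the per-million-token prices, the token counts per request, and the idea that the allowance is consumed at a flat per-token rate. None of these come from Cursor's documentation, and the names `ModelPrice`, `request_cost`, and `requests_within_budget` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Assumed price in USD per million tokens (placeholder, not Cursor's billing)."""
    input_per_mtok: float
    output_per_mtok: float


def request_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request under the assumed per-token prices."""
    total = input_tokens * price.input_per_mtok + output_tokens * price.output_per_mtok
    return total / 1_000_000


def requests_within_budget(budget_usd: float, price: ModelPrice,
                           input_tokens: int, output_tokens: int) -> int:
    """How many identical requests fit in the budget (rounded down)."""
    cost = request_cost(price, input_tokens, output_tokens)
    if cost <= 0:
        raise ValueError("request cost must be positive")
    return int(budget_usd // cost)


# Placeholder numbers: an assumed price and an assumed "typical" agent request.
assumed_price = ModelPrice(input_per_mtok=3.0, output_per_mtok=15.0)
per_request = request_cost(assumed_price, input_tokens=20_000, output_tokens=1_000)
count = requests_within_budget(20.0, assumed_price, input_tokens=20_000, output_tokens=1_000)
print(f"~${per_request:.3f} per request, about {count} requests in $20")
```

Even a rough calculation like this is impossible to verify today, because the dashboard shows neither token counts nor dollar amounts per request.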
System Information
Version: 1.2.1 (user setup)
VSCode Version: 1.99.3
Commit: 031e7e0ff1e2eda9c1a0f5df67d44053b059c5d0
Date: 2025-07-03T06:16:02.610Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Windows_NT x64 10.0.26100
Dashboard Analytics (Week of Jun 26 - Jul 04):
- 9,608 Lines of Agent Edits
- 1 Tab Accepted
- No compute cost information displayed
Support Response Issues
When I contacted support directly, I received a template response that:
- Ignored all specific questions about timing and usage numbers
- Simply repeated the same vague documentation I’d already referenced
- Suggested upgrading to higher plans without addressing transparency concerns
- Failed to provide any actionable information
When Do Limits Actually Reset?
It’s been over 5 hours since my last successful request. If local rate limits “refill fully every few hours,” why haven’t they reset yet? How long do I need to wait to continue using Claude 4 Sonnet?
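To make the timing question concrete, here is a minimal sketch that checks which candidate refill windows are already ruled out by the wait so far. It assumes, without any confirmation from the documentation, that a limit refills a fixed number of hours after it is hit. The candidate windows (2, 6, and 12 hours) are just the guesses listed earlier in this post, the "now" value is a hypothetical stand-in for "over 5 hours", and times are naive local times.

```python
from datetime import datetime, timedelta


def candidate_resets(hit_time: datetime, window_hours: list[int]) -> dict[int, datetime]:
    """Map each assumed window length (hours) to the time it would expire."""
    return {h: hit_time + timedelta(hours=h) for h in window_hours}


def windows_ruled_out(hit_time: datetime, now: datetime,
                      window_hours: list[int]) -> list[int]:
    """Windows that have fully elapsed, so they cannot explain a limit that persists."""
    elapsed = now - hit_time
    return [h for h in window_hours if timedelta(hours=h) <= elapsed]


hit = datetime(2025, 7, 3, 11, 19)          # when the limit was hit (local time)
now = hit + timedelta(hours=5, minutes=5)   # hypothetical value for "over 5 hours"
guesses = [2, 6, 12]                        # guessed window lengths, not documented

for hours, reset_at in candidate_resets(hit, guesses).items():
    print(f"{hours}h window would reset at {reset_at:%Y-%m-%d %H:%M}")
print("Already ruled out:", windows_ruled_out(hit, now, guesses))
```

Under these assumptions a 2-hour window is already ruled out, while 6-hour and 12-hour windows remain possible. Without a documented schedule there is no way to tell which of them, if either, applies.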
Similar Reports
This appears to be a widespread issue based on community discussions:
Cursor Forum Evidence
- Rate limit reset timing confusion: Claude 4 Rate Limit Reset: Cooldown Period vs. Billing Cycle Reset?
- Agent request limits clarification needed: Clarification on Agent Request Limits for Pro Plan
- Pro plan “unlimited” confusion: Pro Plan’s “Unlimited” U‑Turn
- Pricing transparency concerns: Pricing changes should be explicitly announced within the Cursor IDE
- General clarification post: Clarifying June 16 Pro Changes
Requested Solutions
- Specific Usage Numbers:
  - Exact Claude 4 Sonnet request limits for Pro plan
  - Actual dollar consumption per typical request
  - Clear examples of what constitutes heavy vs. normal usage
- Reset Timing Clarity:
  - Specific timeframes for “every few hours” (e.g., “every 4 hours”)
  - Timezone and exact reset schedule
  - Real-time countdown or status in dashboard
- Enhanced Dashboard:
  - Dollar consumption tracking against $20+ limit
  - Remaining usage indicators
  - Next reset time display
- Improved Documentation:
  - Update rate limits page with specific numbers
  - Add FAQ section with common usage scenarios
  - Include practical examples for different use cases
Why Upgrades Aren’t the Solution
I will not consider upgrading to Pro+ or Ultra plans when there’s no transparency about current usage. The $20 compute allowance should be sufficient for reasonable development work. Without clear usage tracking and reset timing, upgrading feels like paying more for the same unclear limits.
The issue isn’t the amount of usage allowed - it’s the complete lack of transparency that prevents effective usage planning and causes unexpected development blocks.
Has anyone else experienced similar issues with rate limit transparency? Any insights into actual reset timing or usage tracking methods would be greatly appreciated.