Let us choose whether we want to use 'Fast' requests, and allow us to use 'Slow' requests when desired

I think it would be fair to implement this feature now that slow requests are so slow they are unusable when you need to get anything done.

User Benefits:

  • Save your fast passes for when you actually need them
  • No more frustrating waits during crunch time
  • You’re in control of how you use your quota
  • “Use it or lose it” model
  • Traffic spreads out naturally
  • Same server capacity: no increase in maximum theoretical load
  • Slow lane should move faster
  • More natural distribution of fast request usage
  • Self-regulating system as users balance immediate needs vs. available quota

Business Benefits:

  • Reduced risk of customers abandoning subscription
  • Increased user satisfaction and perceived value
  • Feels more like a premium service
  • Same load, better distribution
  • Users self-regulate their usage
  • More predictable system patterns
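
To make the request concrete, here is a rough Python sketch of the opt-in behaviour I have in mind. Everything in it is hypothetical: the `RequestQuota` class, the `choose_lane` method, and the allowance of 2 are made up for illustration and are not an existing API or a real limit.

```python
from dataclasses import dataclass


@dataclass
class RequestQuota:
    """Hypothetical per-user monthly allowance of fast requests."""

    fast_remaining: int

    def choose_lane(self, want_fast: bool) -> str:
        """Return the lane a request would use under the proposed opt-in model.

        A fast request is spent only when the user asks for one and quota
        is left; otherwise the request goes to the slow lane.
        """
        if want_fast and self.fast_remaining > 0:
            self.fast_remaining -= 1
            return "fast"
        return "slow"


quota = RequestQuota(fast_remaining=2)  # illustrative allowance, not a real limit
lanes = [
    quota.choose_lane(want_fast=False),  # routine query: keep the quota
    quota.choose_lane(want_fast=True),   # crunch time: spend one
    quota.choose_lane(want_fast=True),   # spend the last one
    quota.choose_lane(want_fast=True),   # quota exhausted: slow lane
]
print(lanes)  # ['slow', 'fast', 'fast', 'slow']
```

The point is that the first request does not burn a fast pass just because one is available; the user decides when to spend it.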

Hey, the best way to work around this right now is probably to use different models for queries that may not require Claude 3.5 Sonnet!

A model like DeepSeek v3, which performs similarly to Claude but is not a premium model, could be a great replacement for Claude and help you stretch your fast request allowance across the month!

The slow pool is really intended as a backup for those who are just over the fast request limit for the month, which is why the slow pool gets slower the more it is used.
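
As a toy illustration of "slower the more it is used" (not the actual scheduling logic, which is not described here), the wait could be pictured as growing with the number of slow requests already made. The function name, the linear shape, and both default values below are assumptions chosen only for the example.

```python
def slow_pool_wait(slow_requests_used: int,
                   base_seconds: float = 5.0,
                   step_seconds: float = 2.0) -> float:
    """Toy model: wait time grows linearly with slow requests already used.

    The linear form and the default numbers are illustrative assumptions,
    not real figures.
    """
    if slow_requests_used < 0:
        raise ValueError("slow_requests_used must be non-negative")
    return base_seconds + step_seconds * slow_requests_used


# A light user of the slow pool waits less than a heavy one.
print(slow_pool_wait(0))   # 5.0
print(slow_pool_wait(10))  # 25.0
```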