Prompt caching with Claude

Prompt Caching is a powerful feature that optimizes your API usage by allowing resuming from specific prefixes in your prompts. This approach significantly reduces processing time and costs for repetitive tasks or prompts with consistent elements.

Prompt caching with Claude allows developers to store frequently used context between API calls, reducing costs by up to 90% and latency by up to 85%.

We’re mainly excited about supporting longer context workloads at much lower latency. Expect features that use this in the coming weeks.

Thank you!

and hopefully more or equivalent uses :slight_smile:

Cool! I can’t wait.

caching cursor, nice ring to it

When ? :slight_smile:

Excited!

@amanrs Hi super excited for this feature. Is there an update on the timeline?

Any news about this?

At the moment the cached prompts have a lifetime of only 5 minutes.
So unless Cursor gets a special cache lifetime it’s not a huge improvement.
And caching costs 25% more so if you cache unimportant context you might end up paying more than if you didn’t as it expires in 5min.

Bumped. @amanrs Any updates on prompt caching with Claude? Has this been integrated?

Gemini also has context prompt / context caching
Context caching  |  Gemini API  |  Google AI for Developers (and its pricing Gemini API 定價  |  Google AI for Developers)

Any updates on this topic?

While we won’t discuss too many details here, we do utilize some efficiency features behind the scenes like Prompt Caching.