Prompt caching with Claude

Prompt caching is a powerful feature that optimizes your API usage by allowing you to resume from specific prefixes in your prompts. This approach significantly reduces processing time and costs for repetitive tasks or prompts with consistent elements.

Prompt caching with Claude allows developers to store frequently used context between API calls, reducing costs by up to 90% and latency by up to 85%.
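In practice this is done by marking a reusable prompt prefix as cacheable. A minimal sketch with the Anthropic Messages API's `cache_control` field is below; the model name, context string, and `build_request` helper are placeholders for illustration, not something from this thread:

```python
# Hedged sketch: build a Messages API request body whose large, reusable
# system prompt is marked cacheable via "cache_control". Identical prefixes
# on later calls can then be read from the cache instead of reprocessed.

LARGE_CODEBASE_CONTEXT = "...thousands of tokens of shared project context..."

def build_request(user_question: str) -> dict:
    """Assemble a request dict with a cacheable system-prompt prefix."""
    return {
        "model": "claude-3-5-sonnet-20240620",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": LARGE_CODEBASE_CONTEXT,
                # Marks everything up to and including this block as the
                # cacheable prefix; only the user message below varies.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_question}],
    }

request = build_request("Where is the auth middleware defined?")
```

Each follow-up question reuses the same cached prefix, so only the short user message is processed at full price.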

We’re mainly excited about supporting longer context workloads at much lower latency. Expect features that use this in the coming weeks.

Thank you!

and hopefully more or equivalent uses 🙂

Cool! I can’t wait.

caching cursor, nice ring to it

When? 🙂

Excited!

@amanrs Hi, super excited for this feature. Is there an update on the timeline?

Any news about this?

At the moment, cached prompts have a lifetime of only 5 minutes, so unless Cursor gets a special cache lifetime it's not a huge improvement.
And cache writes cost 25% more than regular input tokens, so if you cache unimportant context that expires before it's reused, you can end up paying more than if you hadn't cached at all.
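The trade-off above can be checked with quick arithmetic. Assuming Anthropic's published multipliers (cache writes at 1.25× the base input price, cache reads at 0.10×), a cached prefix only pays off if it gets at least one cache hit within the 5-minute TTL:

```python
# Relative cost of N calls that share one cached prompt prefix, assuming
# write = 1.25x and read = 0.10x the uncached input price (per the thread).
BASE = 1.0           # cost of sending the prefix uncached, per call
WRITE = 1.25 * BASE  # first call writes the prefix into the cache
READ = 0.10 * BASE   # later calls within the TTL read it from the cache

def total_cost(calls_within_ttl: int) -> float:
    """Total relative cost of `calls_within_ttl` calls on one cached prefix."""
    if calls_within_ttl == 0:
        return 0.0
    return WRITE + READ * (calls_within_ttl - 1)

# 1 call, no hit:  1.25 vs 1.00 uncached -> caching loses.
# 2 calls in TTL:  1.35 vs 2.00 uncached -> caching already wins.
```

So the 25% surcharge only hurts when the cached context expires unused; one reuse inside the window already cuts the bill roughly in half.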

Bumped. @amanrs Any updates on prompt caching with Claude? Has this been integrated?

Gemini also has prompt / context caching:
Context caching | Gemini API | Google AI for Developers (and its pricing: Gemini API Pricing | Google AI for Developers)