How do cache reads and writes work?

condor · August 17, 2025, 11:11am

Cache reads and writes are handled automatically by the AI provider and can reduce your consumption cost by 70-90%, depending on the provider.

When you submit a request in Chat, the content is stored in the cache (a cache write) for follow-up requests and tool calls. Each follow-up request or tool call submits the full history of that chat as input for AI processing. With caching, however, the provider recognizes that the earlier part of the current chat is already cached (a cache read) and does not need to process it again from scratch.
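To make the effect concrete, here is a rough sketch of how input cost adds up over one chat with and without caching. The two multipliers and the function names are assumptions chosen for illustration, not any provider's actual pricing, and output tokens are ignored. Cost is expressed in "full-price input token" units.

```python
# Illustrative multipliers relative to the normal input-token price.
# These are assumptions for this sketch; real rates vary by provider and model.
CACHE_WRITE_MULTIPLIER = 1.25  # new content stored in the cache
CACHE_READ_MULTIPLIER = 0.10   # earlier chat content served from the cache


def input_cost_without_cache(new_tokens_per_request):
    """Every request pays full price for the whole chat history so far."""
    total = 0.0
    history = 0
    for new_tokens in new_tokens_per_request:
        history += new_tokens
        total += history
    return total


def input_cost_with_cache(new_tokens_per_request):
    """Earlier history is billed as a cache read, new content as a cache write."""
    total = 0.0
    cached = 0
    for new_tokens in new_tokens_per_request:
        total += cached * CACHE_READ_MULTIPLIER
        total += new_tokens * CACHE_WRITE_MULTIPLIER
        cached += new_tokens
    return total


def savings_fraction(new_tokens_per_request):
    """Fraction of input cost saved by caching, between 0 and 1."""
    baseline = input_cost_without_cache(new_tokens_per_request)
    return 1 - input_cost_with_cache(new_tokens_per_request) / baseline


if __name__ == "__main__":
    # Hypothetical chat: one large first request, then four small follow-ups.
    chat = [10_000, 500, 500, 500, 500]
    print(f"without cache: {input_cost_without_cache(chat):.0f}")
    print(f"with cache:    {input_cost_with_cache(chat):.0f}")
    print(f"savings:       {savings_fraction(chat):.0%}")
```

Under these assumed rates, the savings grow with the number of follow-up requests and tool calls, because the resent history is billed at the cheaper cache-read rate each time.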

You can see more details on token usage and how to reduce token consumption here:

Topic	Category	Replies	Views	Activity
How to reduce cache reads	Discussions	6	122	September 11, 2025
Different token types	How To	6	351	July 14, 2025
How to disable Cache Write and Cache Read?	Discussions	50	1066	July 29, 2025
Understanding LLM Token Usage	How To	0	4578	July 20, 2025
How is the Claude-4-sonnet consumption record calculated	Discussions	6	513	July 9, 2025
