Someone please explain - Why are cache read and write chargeable?

MidnightOak · March 4, 2026, 11:35am

I’m not an expert, but I think that when you work directly with some models (rather than through Cursor as a third party), you can enable or disable caching, and enabling it incurs a charge. Cursor seems to enable it automatically, or to let the model decide when to cache. As for the general question of why caching costs anything, I assume it is because it uses additional resources and is optional.

Why caching exists as a separate cost: without caching, you would pay the full input rate on the entire context every time you submit a prompt. With caching enabled, multiple prompts within a timeframe don’t need to resend the same large context (e.g. large text files) as fresh input each time. So the combined cost of writing the cache once and reading it on later prompts is generally cheaper than paying the input rate with each prompt.
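To make the arithmetic concrete, here is a rough sketch of that comparison. The rates and multipliers are placeholder assumptions for illustration only, not Cursor's or any provider's actual pricing, and the function names are made up. It also ignores the new tokens added with each prompt, output tokens, and cache expiry.

```python
# Rough cost comparison: resending a large context vs. caching it.
# All numbers are illustrative assumptions, not real pricing.

def cost_without_cache(context_tokens, prompts, input_rate_per_mtok):
    """Every prompt pays the full input rate on the whole context."""
    return context_tokens / 1_000_000 * input_rate_per_mtok * prompts


def cost_with_cache(context_tokens, prompts, input_rate_per_mtok,
                    write_multiplier, read_multiplier):
    """First prompt writes the cache; later prompts read it."""
    if prompts <= 0:
        return 0.0
    base = context_tokens / 1_000_000 * input_rate_per_mtok
    write_cost = base * write_multiplier                # paid once
    read_cost = base * read_multiplier * (prompts - 1)  # paid on each later prompt
    return write_cost + read_cost


if __name__ == "__main__":
    # Assumed: 100k-token context, 10 prompts, $3 per million input tokens,
    # cache write billed at 1.25x input, cache read at 0.10x input.
    plain = cost_without_cache(100_000, 10, 3.0)
    cached = cost_with_cache(100_000, 10, 3.0, 1.25, 0.10)
    print(f"without cache: ${plain:.3f}")
    print(f"with cache:    ${cached:.3f}")
```

Under these assumed numbers, the cached chain costs a fraction of the uncached one, even though both the write and the reads are billed. With only a single prompt, the cache write alone would cost more than plain input, which is why caching pays off over a chain of requests rather than a one-off.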

When sending prompts in Cursor, you can see the context grow. I believe this is similar to what is cached, which gives you an idea of how cache-intensive a chain of requests is.

Here is a response about caching from a Cursor rep.

Another helpful post
