I’m not an expert, but I think when you work with some models directly (rather than through Cursor as a third party), you can enable or disable caching yourself, and enabling it incurs a charge. Cursor seems to either enable it automatically or let the model decide when to cache. As for the general question of why caching costs anything, I assume it’s because it uses additional resources and is optional.
Why caching exists as a separate cost: without caching, you would pay the full input rate on the entire context every single time you submit a prompt. With caching enabled, multiple prompts within a timeframe don’t have to pay the full input rate again for the same large context (e.g. large text files). So writing to the cache once and reading from it on later prompts is generally cheaper than paying the input rate with each prompt.
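To make the arithmetic concrete, here is a rough sketch in Python. The rates are made-up placeholders (not any provider's or Cursor's actual pricing), and the function name `chain_cost` is just something I picked for illustration. It assumes the context is written to the cache on the first prompt and read from the cache on every prompt after that.

```python
def chain_cost(context_mtok, prompts, input_rate, cache_write_rate, cache_read_rate):
    """Compare the cost of sending the same context with every prompt.

    context_mtok: size of the shared context, in millions of tokens
    prompts:      number of prompts in the chain
    *_rate:       price per million tokens (hypothetical values)
    Returns (cost_without_cache, cost_with_cache).
    """
    # No caching: the whole context is billed at the input rate every time.
    without_cache = prompts * context_mtok * input_rate
    # Caching: one cache write on the first prompt, cache reads afterwards.
    with_cache = (context_mtok * cache_write_rate
                  + (prompts - 1) * context_mtok * cache_read_rate)
    return without_cache, with_cache


# Hypothetical rates per million tokens; check your provider's price list.
INPUT_RATE = 3.00
CACHE_WRITE_RATE = 3.75
CACHE_READ_RATE = 0.30

# A 100k-token context (0.1 million tokens) reused across 10 prompts.
no_cache, cached = chain_cost(0.1, 10, INPUT_RATE, CACHE_WRITE_RATE, CACHE_READ_RATE)
print(f"without cache: ${no_cache:.2f}, with cache: ${cached:.2f}")

# A single prompt: the cache write is never reused.
single_no_cache, single_cached = chain_cost(0.1, 1, INPUT_RATE, CACHE_WRITE_RATE, CACHE_READ_RATE)
```

With these placeholder numbers the cached chain comes out cheaper over 10 prompts, while a single prompt costs slightly more with caching because the cache write is never read back. That matches the idea that caching is a separate, optional cost that pays off only when the context is reused.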
When sending prompts in Cursor, you can see the context grow. I believe this is similar to what gets cached, which gives you an idea of how cache-intensive a chain of requests is.
Here is a response about caching from a Cursor rep.
Another helpful post
