Curious what each token type actually is?
Also curious if we know anything about the cost of each type and while not knowing the limits, if certain tokens use up that limit further and by how much?
As far as my understanding goes:
Input: Prompt
Output: The output of the LLM
Write Cache: Writing the prompt to a (server-side?) cache
Read Cache: Reading from that cache instead of reprocessing something previously discussed
I’m curious about this because my limits get hit so quickly. Looking at my Usage, the majority of the tokens I’m using are Read Cache, but I don’t think those should really affect the rate limits as much as the other 3?
Thanks for the details
Getting an API cost for each token type in each model would be amazing.
Although for most of my prompts of any size, the majority of the tokens are Read Cache, and I still get rate limited nearly instantly.
For 1120 input tokens and 10751 output tokens: over 100k cache write and 5+ million cache read tokens??? I wonder if this explains why I, as a casual hobby coder, hit the new rate limit on all models in a matter of minutes.
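For a rough sense of how those four token types weigh against each other in API terms, here is a small sketch. The per-million-token rates below are assumptions taken from publicly listed Claude 3.5 Sonnet API pricing at the time of writing (cache writes ~1.25x the input rate, cache reads ~0.1x), and may not match current pricing or how subscription rate limits are actually weighted:

```python
# ASSUMED rates in USD per million tokens (Claude 3.5 Sonnet-class pricing,
# as publicly listed at time of writing -- verify against the current pricing page).
RATES_PER_MTOK = {
    "input": 3.00,        # regular prompt tokens
    "output": 15.00,      # generated tokens
    "cache_write": 3.75,  # writing a prompt prefix into the cache (~1.25x input)
    "cache_read": 0.30,   # reading a cached prefix back (~0.1x input)
}

def api_cost(usage: dict) -> float:
    """Approximate USD cost for a token-usage breakdown."""
    return sum(count * RATES_PER_MTOK[kind] / 1_000_000
               for kind, count in usage.items())

# The numbers from the post above:
usage = {
    "input": 1_120,
    "output": 10_751,
    "cache_write": 100_000,
    "cache_read": 5_000_000,
}
print(f"${api_cost(usage):.2f}")
```

With these assumed rates the 5M cache-read tokens come to about $1.50, more than the input, output, and cache writes combined, so even at a ~10x discount per token, cache reads can easily dominate when every turn re-reads a large cached context.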