Until recently, automation cloud agents using the Composer 2 model were hitting cache reads on every automation run and on follow-up prompts. Today, using the exact same prompts, they barely touch cache reads and mostly consume input tokens instead.
For example, a PR review yesterday would use mostly cache read tokens with only a small amount of input tokens. Today, it shows zero cache read tokens and a large amount of input tokens.
Has something changed in the last 24 hours?
It seems to be affecting the IDE and CLI as well. Here are the results using the same prompt. The top two are from the same chat in ask mode. It completely ignored ask mode and implemented it anyway (top entry).
Hello, I got Pro literally last night and was doing some work, checking how the usage compares to the enterprise account we have. After using it for an hour or two, I noticed a spike in consumption on each prompt. I was able to pin it down: Composer 2, and Auto (which is probably routing to the same model), does not use any cache reads and just drains my usage directly.
As an example, on Auto: Cache Read 0, Cache Write 0, Input 5,674,419, Output 25,223, Total 5,699,642.
Here is claude-4.6-opus-high-thinking an hour earlier: Cache Read 1,574,949, Cache Write 263,971, Input 56,095, Output 24,699, Total 2,019,714.
Over roughly 3-4 hours of use, the first half consumed 3% of my quota while the second half consumed 26%.
Their support agent says this is definitely a backend bug, and I hope they address it, because this is horrible to see within a few hours of usage.
Hey, thanks for the detailed report. The table data really helps.
This makes the situation clear: Composer 2 shows 0 cache reads, while Composer 2 Fast, Auto, and Composer 1.5 cache normally. It looks like a backend caching issue specific to Composer 2.
I’ve shared this with the team. No ETA yet, but your report and the confirmation from @Noravus help us prioritize it.
As a workaround, if this is blocking, you can temporarily use Composer 2 Fast. Based on your data, it seems to cache correctly.
If you have a Request ID from one of the affected requests (top right corner of the chat > Copy Request ID), please send it over. It’ll speed up the server-side investigation.
Could this be why last month I was able to get what felt like plenty of usage, whereas today alone (my usage reset to zero this morning) I have somehow eaten up 40% of my auto usage? I have no clue how this could be true.
Yeah, that’s very likely related. If you’re using Composer 2 or Auto (which can pick Composer 2), then no cache reads means the whole context is sent as input tokens, so your usage goes up fast.
As a workaround, try switching to Composer 2 Fast for now. Based on that thread, caching works fine there.
Can you share a couple things so we can check:
Which model are you using (Composer 2, Auto, or something else)?
A Request ID from one of the heavy requests: top right of the chat > Copy Request ID.
The team is aware of the issue, and your report helps with prioritization.
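To make the impact concrete, here is a rough back-of-the-envelope sketch using the two requests reported earlier in this thread. The pricing weights are hypothetical (many providers bill cache reads at a fraction of the normal input rate and cache writes at a small premium; the exact ratios vary by model and provider), so treat this as an illustration of why zero cache reads inflates usage, not as actual billing math.

```python
# Illustrative sketch with ASSUMED pricing weights:
# cache reads at 0.1x the input-token rate, cache writes at 1.25x.
# Real ratios depend on the model/provider and may differ.

def effective_input_tokens(cache_read, cache_write, plain_input,
                           read_rate=0.1, write_rate=1.25):
    """Weight each token class by an assumed relative price."""
    return cache_read * read_rate + cache_write * write_rate + plain_input

# Numbers from the two requests reported above:
no_cache = effective_input_tokens(0, 0, 5_674_419)            # Composer 2 / Auto, 0 cache reads
cached = effective_input_tokens(1_574_949, 263_971, 56_095)   # opus run an hour earlier

print(f"no-cache request: {no_cache:,.0f} effective input tokens")
print(f"cached request:   {cached:,.0f} effective input tokens")
print(f"roughly {no_cache / cached:.1f}x more expensive without caching")
```

Under these assumed weights, the uncached request works out to roughly 10x the effective input cost of the cached one, which lines up with the jump from 3% to 26% quota consumption reported above.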