I’ve noticed a billing discrepancy with the Claude 4 Sonnet model.
When I select Claude 4 Sonnet in chat, the responses I receive appear to be generated by Claude 3.5 Sonnet, not Claude 4. I explicitly ask the model to indicate which version is actually responding (in brackets), and it consistently replies that it’s Claude 3.5 Sonnet.
Despite this, in the dashboard usage logs, I am being charged at the rate for Claude 4 Sonnet.
I also asked the assistant directly what model is being used, and it confirmed that it’s Claude 3.5 Sonnet.
Can you please clarify:
1. Why is this fallback happening?
2. Why am I being billed for Claude 4 Sonnet if that’s not the model responding?
3. What can be done to correct this?
Thank you for your report and your concern about model usage.
When you select a specific model, your request is passed to the AI provider using that model only. We have checked this several times, and there is no issue with requests being routed to the wrong model.
So when Claude 4 Sonnet is selected, you will receive a response from the Claude 4 Sonnet model.
Please note the following:
Anthropic models are trained to identify themselves only as “Claude”.
Claude 4 models were trained on data that contained information about Claude 3.5 Sonnet.
They are also trained to be helpful, so when they lack sufficient information they may give the most probable answer, even if it is not correct.
I’ve tested this several times, others have verified it as well in the Anthropic Console, and you can test it yourself with their API. In about 30% of cases the model gives the wrong answer.
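The reliable way to check which model actually served a request is the `model` field in the provider's response metadata, not the generated text. A minimal sketch of that distinction, using a hypothetical response shaped like the Anthropic Messages API (the model id and self-reported text here are illustrative, not taken from a real request):

```python
# Sketch: the response metadata is authoritative; the model's self-report is not.

def served_model(response: dict) -> str:
    """Return the model id the provider reports having served (authoritative)."""
    return response["model"]

def self_reported_name(response: dict) -> str:
    """Return whatever name the model claims in its generated text (unreliable)."""
    return response["content"][0]["text"]

# Hypothetical response illustrating the mismatch described above:
response = {
    "model": "claude-sonnet-4-20250514",  # what was actually served and billed
    "content": [
        {"type": "text", "text": "I am Claude 3.5 Sonnet."}  # wrong self-report
    ],
}

print(served_model(response))        # metadata: the billed model id
print(self_reported_name(response))  # generated text: may contradict it
```

The point is that billing follows the metadata, so a mismatch between the two fields only shows the model misidentifying itself, not a routing or billing error.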
1. The billing error is obvious to me, at the very least based on the screenshot below.
It shows the last four usage records for the GPT-4.1 model:
Two of them consumed over 200,000 tokens before I restarted Cursor, while the issue was happening.
(In the chat, the model explicitly said it was Claude 3.5, not GPT-4.1.)
The other two consumed around 17,000 tokens after I restarted Cursor, once the issue went away.
The complexity of all four prompts was nearly the same. Normally, GPT-4.1 responses cost around 20,000 tokens, so there is a clear 10x discrepancy here for equivalent prompts.
2. After restarting Cursor, the model now correctly displays the actual name for all requests — whether I use Claude 3.5 or Claude 4 in chat.
However, even when I selected GPT-4.1, the model was responding that it was Claude 3.5 — which clearly shouldn’t happen.
A model that is not Claude 3.5 Sonnet cannot “accurately” tell you that it is Claude 3.5 Sonnet.
Could you post a Request ID with privacy mode disabled so we can look into the details of a request that claims to be a different model? Cursor – Getting a Request ID