Why does one message cost over 200 requests?

It’s so expensive! Is this expected?

I think there’s a serious issue with request accounting here.

According to the model section, Opus 4.6 Fast / Thinking is marked with a request multiplier of about 6×. That implies one user message should cost around 6 requests, not hundreds.

However, as shown in the screenshot, I sent one very small, simple message, just to test the model, and it consumed ~232 requests in a single entry.

To be clear:

  • I sent one message

  • The model UI indicates about 6×, not 200×+

  • No batch, loop, agent workflow, or repeated prompts

  • Yet over 200 requests were deducted at once

If internal backend calls are counted separately, this is not clearly communicated, and it directly contradicts what the model selector implies. From a user’s perspective, this looks misleading and unfair.

Hey @Jonas_Mikkel_Mind!

On request-based plans, Max mode bills at token-based prices: the total cost of the tokens used is converted into an equivalent number of requests. Combined with fast mode, it is just that expensive.
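To make the conversion concrete, here is a rough sketch of how a token cost could be turned into a request count. This is only an illustration, not the actual billing code: the function name, the per-million-token prices, the per-request price, and the round-up rule are all assumptions chosen for the example.

```python
from decimal import Decimal, ROUND_CEILING


def equivalent_requests(input_tokens, output_tokens,
                        usd_per_mtok_in, usd_per_mtok_out,
                        usd_per_request):
    """Convert token usage into an equivalent number of requests.

    Prices are passed as strings so Decimal keeps them exact.
    Rounding up to a whole request is an assumption.
    """
    million = Decimal(1_000_000)
    # Dollar cost of the tokens at the (hypothetical) per-million-token prices.
    token_cost = (Decimal(input_tokens) * Decimal(usd_per_mtok_in)
                  + Decimal(output_tokens) * Decimal(usd_per_mtok_out)) / million
    # Express that dollar cost in units of one request, rounded up.
    requests = token_cost / Decimal(usd_per_request)
    return int(requests.to_integral_value(rounding=ROUND_CEILING))


# Hypothetical numbers: a short message that still carries 200k tokens of
# context and produces 20k output tokens, at made-up prices.
print(equivalent_requests(200_000, 20_000, "30", "150", "0.04"))  # prints 225
```

The point of the sketch is that the count depends on the tokens processed (context included), not on how many messages were sent, so a single short message can still map to hundreds of requests.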

you were lucky ^^

look at this ^^

image

But damn it was fast and fixed 4 errors without asking (introduced 1 though)

Just use Artemonim/AgentEnforcer2 on GitHub (a reference architecture for building robust, language-agnostic local CI systems) 😎👍

Will take a look. Thank you, Artemonim!