Codex models pricing - 30 cents per request

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

I'm being charged 30 cents per request for the lower-end Codex model, while the higher-end model in the same line costs about a third of that. How does that make sense?

Steps to Reproduce

Work with codex in a long conversation.

Screenshots / Screen Recordings

Operating System

MacOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.4.0-pre.13.patch.0 (Universal)
VSCode Version: 1.105.1
Commit: e459e7c07c1dbd09c479315377cffb5cc7fe46b0
Date: 2026-01-09T21:11:29.306Z
Build Type: Stable
Release Track: Nightly
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Darwin arm64 25.1.0

For AI issues: which model did you use?

codex

For AI issues: add Request ID with privacy disabled

65bb419c-1919-4fc3-a003-9b21be666c4d

Additional Information

Is this expected behavior, or is it a bug?

Does this stop you from using Cursor

Yes - Cursor is unusable

Is this related to prompt caching or something? The charge seems far too high.

Hi @tejasPhaveri, thanks for the report. Cursor shows token usage exactly as returned by the AI provider for each request.

You can open Dashboard → Usage to inspect that specific request in detail, including whether caching was applied. The Billing & Invoices page only shows an aggregated summary.

In general, caching can reduce input token costs by around 90%, but it only helps when there are follow-up requests that can reuse the same context. From the two highlighted lines in your screenshot, it looks like the -max request was able to cache most of its input tokens, while the single non--max call could not benefit from caching because there were no subsequent related requests. The first request in a context is always billed at the full input token price, and in this case it also appears to have used a large context window.
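To make the arithmetic concrete, here is a small sketch of how a cache discount changes the input-token bill. The per-million-token price and the 90% cache discount are illustrative assumptions for the sketch, not Cursor's or any provider's actual rates:

```python
# Sketch of input-token cost with and without prompt caching.
# PRICE and DISCOUNT are illustrative assumptions, not real rates.

def input_cost(tokens: int, cached_fraction: float,
               price_per_mtok: float = 1.25,
               cache_discount: float = 0.90) -> float:
    """Dollar cost of one request's input tokens.

    cached_fraction: share of the input served from the prompt cache.
    Cached tokens are billed at (1 - cache_discount) of the full price.
    """
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached + cached * (1 - cache_discount)) * price_per_mtok / 1_000_000

# First request in a context: nothing is cached yet, so full price.
first = input_cost(232_000, cached_fraction=0.0)

# A follow-up that reuses the same 232k-token context from cache.
follow_up = input_cost(232_000, cached_fraction=1.0)

print(f"first request:    ${first:.2f}")
print(f"cached follow-up: ${follow_up:.2f}")
```

With these assumed numbers, a 232k-token first request lands right around 29 cents, while a fully cached follow-up of the same size costs roughly a tenth of that.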

Could you share the detailed entries for those codex requests from the Usage log so we can double-check what happened?

usage-events-2026-01-10.csv (12.8 KB)

hope this helps

Thank you, yes, that was helpful. The one non-max request was indeed 100% input tokens, as there was no follow-up, and it had the full 232k tokens as input context. The amount charged therefore appears to be correct.

Do you recall whether you attached any files/logs/mcps/…? The total token count is quite high.

Nope, it was just a long conversation. So, to confirm: if it's a long conversation and I change models partway through, I get charged 30 cents per request. Is that correct?

req id: 65bb419c-1919-4fc3-a003-9b21be666c4d

It is not advisable to change AI models within an existing chat, because on any model change we need to send the full thread to the AI provider for tokenization and processing. In that case there is no cache available yet, so the full context is billed at the full input-token price.
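The point above can be sketched numerically: a prompt cache is effectively per-model, so the first request after a switch pays full input price for the whole thread again. Prices and the discount below are illustrative assumptions only:

```python
# Sketch: cost impact of switching models mid-thread.
# A cache is per-model, so the first request to the new model is cold.
# PRICE_PER_MTOK and CACHE_DISCOUNT are assumptions, not real rates.

PRICE_PER_MTOK = 1.25   # assumed input price per million tokens
CACHE_DISCOUNT = 0.90   # assumed discount for cache-hit tokens

def request_cost(context_tokens: int, cache_warm: bool) -> float:
    """Dollar cost of sending the whole thread as input once."""
    rate = PRICE_PER_MTOK * ((1 - CACHE_DISCOUNT) if cache_warm else 1.0)
    return context_tokens * rate / 1_000_000

context = 232_000  # size of the long conversation in the report

same_model = request_cost(context, cache_warm=True)    # follow-up, cache hit
after_switch = request_cost(context, cache_warm=False)  # new model, cold cache

print(f"same-model follow-up: ${same_model:.2f}")
print(f"after model switch:   ${after_switch:.2f}")
```

Under these assumptions the post-switch request costs roughly 10x the cached follow-up, which matches the "30 cents per request after switching" pattern described above.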

Gotcha, makes sense. Curious, though: how does the subagents feature in the nightly build work? I see it goes off and gathers context for the main chat; does it also incur this input-token cost?

And how does this work if I start the chat with composer, switch to plan mode to build the plan, and then implement it with codex?

Subagents make their own requests, and any AI usage consumes tokens.

Switching from planning with one model to building with another involves the same re-sending of the chat as new input, but you have options:

  • Planning is usually shorter and cheaper than implementation loops, so switching models there is less critical for cost.
  • If cost is the primary factor, save the plan to a file and start a new chat with codex using that plan file.

Ah, gotcha, sounds good. And nah, I was just curious how that worked. Thanks!