Understanding Cursor's token usage

To enhance my understanding, could a team member explain how context works with Cursor?

  • Is the context (e.g. my project files) sent with every chat message, like how ChatGPT works? Does that mean the VERY large context of the whole project is sent each time? Or is only the current file sent each time? Or are embeddings created from the code and updated as needed?
  • Are there any plans to use OpenAI’s new Assistants API to maintain project state? Could this reduce ongoing token consumption?

If you’re focused on a file, it will send that full file as the “Current file” with your message.

When you use codebase context, it won’t send your entire codebase with your message but only the most relevant snippets (from embeddings).
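To make the idea concrete, here is a minimal sketch of embedding-based snippet selection. It is not Cursor's actual implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the names `top_snippets` and `build_prompt` are hypothetical. It only illustrates that the prompt ends up containing the current file plus a few top-ranked snippets rather than the whole codebase.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_snippets(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Return the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, current_file: str, snippets: list[str], k: int = 2) -> str:
    """Assemble a prompt from the current file and only the top-k snippets."""
    parts = ["Current file:\n" + current_file]
    parts += ["Relevant snippet:\n" + s for s in top_snippets(question, snippets, k)]
    parts.append("Question:\n" + question)
    return "\n\n".join(parts)
```

With this shape, the token cost of a message grows with the size of the current file and the number of snippets chosen (`k`), not with the size of the project.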

Cursor already handles chat state by itself. But the threads feature from the new Assistants API is nonetheless a pretty cool addition.

Are you saying that the threads feature wouldn’t help you at all, as you already have a similar enough feature in place on your servers? I’m wondering if any part of it could help reduce costs/increase context limit?

The reason I’m asking is that Cursor sometimes doesn’t seem to recognise earlier context in a chat. It feels as though I’ve hit a limit, or as though it’s trimming context to cut costs.

Generally it works well, but there are the odd occasions.
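For illustration only, here is one common way a chat client can end up "forgetting" earlier messages: keeping only the most recent messages that fit within a fixed token budget. Whether Cursor does this is an assumption, not something confirmed in this thread. The function name `trim_history` is hypothetical, and word count is used as a rough stand-in for a real tokenizer.

```python
def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages whose combined (approximate) token count fits the budget."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent messages are kept first.
    for msg in reversed(messages):
        cost = len(msg.split())  # crude token estimate: one token per word
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Under a scheme like this, a long chat would lose its oldest messages first, which would match the symptom of earlier context occasionally not being recognised.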