Display number of tokens in current chat

I am not sure if this information is easily available, but I just noticed that when mentioning files with the @ feature in Ctrl + L in Long Context Chat mode, there is a handy feature that displays the token length of each file next to its name.

That is very cool.

It made me wonder if it would be possible to display the cumulative token length of the chat somewhere - i.e. a value that updates as the chat progresses - just to see how large the chat is actually getting.

In case it makes a difference, I saw the token length of the files in this scenario:

  • The ‘Long Context Chat’ option is enabled in the Cursor Beta settings area
  • Ctrl + L
  • Select Long Context Chat mode with claude-3-5-sonnet-200k
  • Press @ to mention some files

(screenshot: long_context_chat_file_token_length)

5 Likes

Bumping this awesome idea.

Sounds like a great feature that would enhance our ability to manage token usage better!

Also, maybe a display to show remaining tokens allowed for that chat.

1 Like

Definitely need this 🙂

I have been thinking recently that I have a very poor conceptual intuition of token counts when it comes to chat length.

In fact, this evening I googled something like:

how to visualise token length in ai models

And it returned results like these:

**01)** What are tokens and how to count them?
https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them

Takeaway:

Here are some helpful rules of thumb for understanding tokens in terms of lengths:

  • 1 token ~= 4 chars in English

  • 1 token ~= ¾ words

  • 100 tokens ~= 75 words

Or

  • 1-2 sentence ~= 30 tokens

  • 1 paragraph ~= 100 tokens

  • 1,500 words ~= 2048 tokens

To get additional context on how tokens stack up, consider this:

  • Wayne Gretzky’s quote “You miss 100% of the shots you don’t take” contains 11 tokens.

  • OpenAI’s charter contains 476 tokens.

  • The transcript of the US Declaration of Independence contains 1,695 tokens.
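The rules of thumb above can be turned into a quick back-of-the-envelope estimator. This is only a sketch of the heuristic, not a real tokenizer (the `estimate_tokens` helper name is my own), but it lands surprisingly close on the Gretzky quote:

```python
# Rough token estimator based on the OpenAI rules of thumb above.
# Heuristics for English text only - not a real tokenizer:
#   1 token ~= 4 characters
#   1 token ~= 3/4 of a word

def estimate_tokens(text: str) -> int:
    """Average the character-based and word-based heuristics."""
    by_chars = len(text) / 4          # 1 token ~= 4 chars
    by_words = len(text.split()) / 0.75  # 1 token ~= 3/4 words
    return round((by_chars + by_words) / 2)

quote = "You miss 100% of the shots you don't take"
print(estimate_tokens(quote))  # 11 - matching the article's count for this quote
```

Real tokenizers split on subword units, so counts for code, punctuation-heavy text, or non-English text can diverge a lot from this estimate.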

**02)** Visualizing Token Limits in Large Language Models
https://galecia.com/blogs/jim-craner/visualizing-token-limits-large-language-models

Takeaway:

“This sentence contains six tokens.” has 6 tokens and 36 characters.

The Gettysburg Address has 310 tokens and 1,453 characters.

The US Declaration of Independence has 1,638 tokens and 8,147 characters.

Anne of Green Gables, chapter 1 has 3,549 tokens and 15,585 characters.
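Plugging those four data points into the “1 token ~= 4 chars” rule shows how rough it is - the ratio drifts between roughly 4.4 and 6 characters per token depending on the text:

```python
# Characters-per-token ratios for the examples above, as a sanity check
# on the "1 token ~= 4 chars" rule of thumb. Figures are taken directly
# from the blog post quoted above.
samples = {
    "Six-token sentence": (36, 6),
    "Gettysburg Address": (1453, 310),
    "US Declaration of Independence": (8147, 1638),
    "Anne of Green Gables, ch. 1": (15585, 3549),
}

for name, (chars, tokens) in samples.items():
    print(f"{name}: {chars / tokens:.2f} chars/token")
```

Short sentences skew high (6.00 chars/token) while longer prose settles near the 4-5 range, which is why the rule of thumb works better the longer the text is.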

**03)** Visualizing the size of Large Language Models
https://medium.com/@georgeanil/visualizing-size-of-large-language-models-ec576caa5557

Takeaway:

If we assume a typical book contains ~100,000 tokens and a typical library shelf holds ~100 books, each shelf would contain about 10 million tokens.
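That analogy also gives a feel for the chat modes in this thread. A quick sketch, using the blog's ~100,000-tokens-per-book figure and the 200k context window of claude-3-5-sonnet-200k mentioned above:

```python
# Back-of-the-envelope scale, using the figures from the analogy above.
tokens_per_book = 100_000
books_per_shelf = 100

tokens_per_shelf = tokens_per_book * books_per_shelf
print(f"{tokens_per_shelf:,} tokens per shelf")  # 10,000,000 tokens per shelf

# A 200k-token context (e.g. claude-3-5-sonnet-200k) holds roughly
# two "books" worth of text.
books_in_200k_context = 200_000 / tokens_per_book
print(f"{books_in_200k_context:.0f} books per 200k context")
```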

I still don’t think these figures help me guess my chat lengths yet!

When things are getting slow, or I want to do something big, I just switch to a long-context chat and hope for the best!

3 Likes

+1
One thing I never know: if I tag certain files or folders as context, do I need to tag them again in subsequent Q&A, or does the chat retain them? And if it already retains them and I tag them again, am I needlessly wasting tons of tokens?

2 Likes

Hi @wm9 ,

I think that question is worthy of its own topic!

Something like:

Do I need to re-tag files and folders in a chat or do they persist in context once added?

1 Like

I wonder how this guy did it in this video: