Definitely need this.
I have been thinking recently that I have a very poor conceptual intuition of token counts when it comes to chat length.
In fact, this evening I googled something like:
how to visualise token length in ai models
And it returned results like these:
**01)**
What are tokens and how to count them?
https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
Takeaway:
Here are some helpful rules of thumb for understanding tokens in terms of lengths:
- 1-2 sentences ~= 30 tokens
- 1 paragraph ~= 100 tokens
- 1,500 words ~= 2048 tokens
To get additional context on how tokens stack up, consider this:
- Wayne Gretzky’s quote “You miss 100% of the shots you don’t take” contains 11 tokens.
- OpenAI’s charter contains 476 tokens.
- The transcript of the US Declaration of Independence contains 1,695 tokens.
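To sanity-check rules of thumb like these yourself, here is a minimal sketch using OpenAI's `tiktoken` library (the sample texts and the `cl100k_base` encoding choice are my own assumptions, not from the article):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is one of OpenAI's tokenizer encodings; other models and
# vendors tokenize differently, so counts will vary a bit.
enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "short sentence": "You miss 100% of the shots you don't take.",
    "paragraph": (
        "Tokens are the chunks of text a model actually reads and writes. "
        "A token is often a word fragment rather than a whole word, so "
        "token counts rarely line up exactly with words or characters."
    ),
}

for label, text in samples.items():
    print(f"{label}: {len(enc.encode(text))} tokens, {len(text)} characters")
```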
**02)**
Visualizing Token Limits in Large Language Models
https://galecia.com/blogs/jim-craner/visualizing-token-limits-large-language-models
Takeaway:
- “This sentence contains six tokens.” has 6 tokens and 36 characters.
- The Gettysburg Address has 310 tokens and 1,453 characters.
- The US Declaration of Independence has 1,638 tokens and 8,147 characters.
- Anne of Green Gables, chapter 1 has 3,549 tokens and 15,585 characters.
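Those figures do at least suggest a steady ratio; dividing characters by tokens for each example (using only the numbers quoted above):

```python
# Characters-per-token ratio for the figures quoted above
examples = {
    '"This sentence contains six tokens."': (6, 36),
    "Gettysburg Address": (310, 1_453),
    "US Declaration of Independence": (1_638, 8_147),
    "Anne of Green Gables, chapter 1": (3_549, 15_585),
}

for name, (tokens, chars) in examples.items():
    print(f"{name}: {chars / tokens:.1f} characters per token")
# Everything lands around 4-6 characters per token for English prose.
```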
**03)**
Visualizing the size of Large Language Models
https://medium.com/@georgeanil/visualizing-size-of-large-language-models-ec576caa5557
Takeaway:
If we assume a typical book contains ~100,000 tokens and a typical library shelf holds ~100 books, each shelf would contain about 10 million tokens.
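Carrying that arithmetic a step further, here is a quick sketch of how many of those "typical books" fit in a context window (the window sizes below are illustrative assumptions, not figures for any particular model):

```python
TOKENS_PER_BOOK = 100_000   # the article's assumption for a typical book
BOOKS_PER_SHELF = 100       # the article's assumption for a library shelf
TOKENS_PER_SHELF = TOKENS_PER_BOOK * BOOKS_PER_SHELF  # = 10,000,000 tokens

# Illustrative context-window sizes (my assumptions for the example)
context_windows = {"8k": 8_000, "128k": 128_000, "200k": 200_000}

for name, window in context_windows.items():
    print(f"{name} window ≈ {window / TOKENS_PER_BOOK:.2f} books")
# A 200k-token window holds roughly two typical books' worth of text.
```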
Even so, I still don’t think these figures help me estimate my own chat lengths!
When things are getting slow, or I want to do something big, I just switch to a long-context chat and hope for the best!
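One way to turn "hope for the best" into an actual number would be to run the chat text through a tokenizer and compare it against the window size. A minimal sketch, assuming the chat is just a list of message strings and again using `tiktoken` (real chat APIs add per-message overhead tokens, so treat this as a lower bound):

```python
# pip install tiktoken
import tiktoken

def chat_usage(messages, context_window=128_000):
    """Estimate how much of a context window a chat has used.

    `messages` is assumed to be a plain list of message strings; the
    128k default window is an illustrative assumption, not a fixed limit.
    """
    enc = tiktoken.get_encoding("cl100k_base")
    used = sum(len(enc.encode(m)) for m in messages)
    return used, used / context_window

used, fraction = chat_usage([
    "How do I visualise token length in AI models?",
    "Here are some rules of thumb: one paragraph is roughly 100 tokens...",
])
print(f"~{used} tokens used, {fraction:.1%} of a 128k window")
```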