Definitely need this
I have been thinking about this recently: I have very poor intuition for token counts when it comes to chat length.
In fact, this evening I googled something like:
how to visualise token length in ai models
And it returned results like these:
What are tokens and how to count them?
Here are some helpful rules of thumb for understanding tokens in terms of lengths:
1-2 sentences ~= 30 tokens
1 paragraph ~= 100 tokens
1,500 words ~= 2048 tokens
To get additional context on how tokens stack up, consider this:
Wayne Gretzky’s quote “You miss 100% of the shots you don’t take” contains 11 tokens.
OpenAI’s charter contains 476 tokens.
The transcript of the US Declaration of Independence contains 1,695 tokens.
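For what it's worth, the rule of thumb above can be turned into a quick estimator. This is only a rough sketch: it assumes the quoted ratio "1,500 words ~= 2048 tokens" holds for your text, and the function names (estimate_tokens, estimate_chat_tokens) are made up for illustration. It is not a real tokenizer, and actual counts will vary by model.

```python
# Rough token estimate from a word count, based on the quoted rule of thumb
# "1,500 words ~= 2048 tokens". An approximation only, not a real tokenizer.
RULE_WORDS = 1500
RULE_TOKENS = 2048


def estimate_tokens(text: str) -> int:
    """Estimate tokens by counting whitespace-separated words."""
    words = len(text.split())
    return round(words * RULE_TOKENS / RULE_WORDS)


def estimate_chat_tokens(messages: list[str]) -> int:
    """Sum the per-message estimates for a whole chat (hypothetical helper)."""
    return sum(estimate_tokens(message) for message in messages)


if __name__ == "__main__":
    chat = [
        "How do I visualise token length in AI models?",
        "A common rule of thumb is that a paragraph is about 100 tokens.",
    ]
    print(estimate_chat_tokens(chat))
```

On the Gretzky quote (9 words) this gives about 12 tokens against the quoted 11, so it is fine for ballpark guesses and not much more.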
Visualizing Token Limits in Large Language Models
“This sentence contains six tokens.” has 6 tokens and 36 characters.
The Gettysburg Address has 310 tokens and 1,453 characters.
The US Declaration of Independence has 1,638 tokens and 8,147 characters.
Anne of Green Gables, chapter 1 has 3,549 tokens and 15,585 characters.
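One way to make those four examples more usable is to work out the characters-per-token ratio for each. The sketch below uses only the figures quoted above; the variable and function names are my own, and the ratios apply to whatever tokenizer that article used, which may not match the model you are chatting with.

```python
# Characters-per-token ratios computed from the figures quoted above.
# Each entry is (tokens, characters) as stated in the quoted article.
SAMPLES = {
    "Six-token sentence": (6, 36),
    "Gettysburg Address": (310, 1453),
    "US Declaration of Independence": (1638, 8147),
    "Anne of Green Gables, chapter 1": (3549, 15585),
}


def chars_per_token(tokens: int, chars: int) -> float:
    """Average number of characters per token for one sample."""
    return chars / tokens


if __name__ == "__main__":
    for name, (tokens, chars) in SAMPLES.items():
        print(f"{name}: {chars_per_token(tokens, chars):.2f} chars/token")
```

The longer texts come out at roughly 4.4 to 5 characters per token, while the short sentence comes out at 6, so dividing a chat's character count by about 4.5 should give a ballpark token count for ordinary English prose.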
Visualizing the size of Large Language Models
If we assume a typical Book contains ~100,000 Tokens and a typical Library shelf holds ~100 books, each Library shelf would contain about 10 million Tokens.
I still don’t think these figures help me guess my chat lengths yet!
When things are getting slow, or I want to do something big, I just switch to a long-context chat and hope for the best!