For almost two years, I’ve had this burning question:
What if we fed a Large Language Model (LLM) its input at the bit level: no tokens, no characters, not even bytes, just raw bits? Would the LLM still work?
I waited and waited for someone to publish on the topic. Nothing. So I finally said, “Screw it,” and tried it myself. Guess what?
The LLM actually works fine on plain binary input: just a stream of zeros and ones powering the model!
Key facts:
- I built this on top of Andrej Karpathy’s famous nanoGPT.
- Using Cursor AI and Claude-3.5-Sonnet, I crafted a custom bit-level tokenizer, trained the LLM, and tested it, all in under 2 (two!) hours. A minimal sketch of what such a tokenizer looks like follows this list.
- I didn’t write a single line of code by hand this time.
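
To make the idea concrete, here is a minimal sketch of what a bit-level tokenizer can look like, in the spirit of nanoGPT’s character-level encoder. This is my illustration under stated assumptions, not the exact code Cursor generated: the vocabulary is just {0, 1}, and encode/decode round-trip text through UTF-8 bytes, most-significant bit first.

```python
# Minimal bit-level "tokenizer": the vocabulary is just {0, 1}.
# Illustrative sketch only; not the code from the actual repo.

def encode(text: str) -> list[int]:
    """Turn a string into a flat stream of bits (0/1 ints), MSB first per byte."""
    bits = []
    for byte in text.encode("utf-8"):
        for i in range(7, -1, -1):  # walk each byte from bit 7 down to bit 0
            bits.append((byte >> i) & 1)
    return bits

def decode(bits: list[int]) -> str:
    """Reassemble a bit stream into a string (ignores a trailing partial byte)."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b  # shift previous bits left, append the new one
        out.append(byte)
    return out.decode("utf-8", errors="replace")

# Round-trip check: "Hi" -> 16 bits -> "Hi"
assert decode(encode("Hi")) == "Hi"
print(encode("Hi"))  # [0,1,0,0,1,0,0,0, 0,1,1,0,1,0,0,1]
```

The trade-off is obvious once you see it: with a vocab size of 2, the embedding table shrinks to two rows, but every ASCII character becomes 8 tokens, so sequences get 8x longer and the effective context window shrinks accordingly.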
Check out the GitHub repo and the Colab notebook if you’re curious.
Honestly, I’m still a bit stunned.
What a time to be alive!
Dear Cursor AI team, thank you!