FIVE PRIMES
Every byte splits into 5 independent channels via the Chinese Remainder Theorem. Type anything below. Watch it decompose.
Each byte n maps to (n mod 2, n mod 3, n mod 5, n mod 7, n mod 11). Because the moduli are pairwise coprime and 2 x 3 x 5 x 7 x 11 = 2310 > 256, the CRT guarantees the map is injective: the five residues jointly determine n. Information splits, not copies.
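The decomposition and its inverse can be sketched in a few lines; the helper names below are illustrative, not the demo's actual API:

```python
# Sketch of the byte -> residue decomposition described above.
PRIMES = [2, 3, 5, 7, 11]

def decompose(n: int) -> tuple:
    """Map a byte to its five residue channels."""
    return tuple(n % p for p in PRIMES)

def reconstruct(residues) -> int:
    """Invert via the constructive CRT (unique since 2*3*5*7*11 = 2310 > 256)."""
    M = 1
    for p in PRIMES:
        M *= p                          # M = 2310
    x = 0
    for r, p in zip(residues, PRIMES):
        Mp = M // p
        x += r * Mp * pow(Mp, -1, p)    # three-arg pow gives the modular inverse (Python 3.8+)
    return x % M

print(decompose(65))                    # byte 'A' -> (1, 2, 0, 2, 10)
assert all(reconstruct(decompose(n)) == n for n in range(256))
```

The round trip over all 256 byte values confirms that no information is lost in the split.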
Each cell = one byte. Color = CRT channel blend. Hover for decomposition.
Mutual information between channel pairs, in bits. Statistically independent channels score 0. Real text has statistical structure that leaks across channels, but the channels themselves are algebraically independent: no residue constrains another.
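A minimal sketch of the measurement, assuming a plug-in (empirical-frequency) estimator; the function name and sample text are illustrative:

```python
# Estimate mutual information (in bits) between two residue channels of a byte stream.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in MI estimate from empirical joint and marginal frequencies."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

data = b"the quick brown fox jumps over the lazy dog"
mod2 = [b % 2 for b in data]
mod3 = [b % 3 for b in data]
print(round(mutual_information(mod2, mod3), 3))  # near 0; any excess reflects text structure
```

On truly independent channels the estimate is exactly 0; on real text it is small but positive, which is the structure the heatmap visualizes.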
Modern AI tokenizers (BPE, SentencePiece, tiktoken) are statistical: they learn merge rules from training data. CRT decomposition is algebraic: it exposes ring structure that pattern statistics alone cannot reveal.
The theoretical redundancy at the byte level (N=210) is 20x by the Rissanen theorem. This means the joint encoding across the 5 independent channels carries more bits than the minimum needed to encode the original byte stream; that surplus is the redundancy the decomposition exploits.
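As a rough per-byte sanity check, the raw bit cost of the five residue channels can be compared with the 8 bits of the original byte. This computes only the direct representational expansion, not the token-level Rissanen figure quoted above:

```python
# Bits carried by the five residue channels vs. the 8 bits of a raw byte.
from math import log2

PRIMES = [2, 3, 5, 7, 11]
channel_bits = sum(log2(p) for p in PRIMES)   # = log2(2310), about 11.17 bits
expansion = channel_bits / 8                  # raw expansion over one byte
print(f"{channel_bits:.2f} bits across channels, {expansion:.2f}x expansion")
```

The sum of per-channel bit costs equals log2 of the product modulus, since the moduli are coprime.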
| METHOD | APPROACH | TYPE |
|---|---|---|
| CRT | 5 algebraically independent channels. Provably optimal decomposition for mod-210 data. Rissanen 20x theoretical redundancy. | Algebraic |
| BPE | Statistical subword merging. GPT-4 uses ~100K token vocabulary. Learns patterns, doesn't see ring structure. | Statistical |
| zstd | Best general compressor: ~3-4x. Dictionary + entropy coding. No algebraic decomposition. | Statistical |
The ring Z/210Z = Z/2 x Z/3 x Z/5 x Z/7 has 48 units (phi(210) = 48).
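The unit count can be verified by brute force; a one-off check, not part of the demo:

```python
# Count the units of Z/210Z: elements coprime to 210. Expect phi(210) = 48.
from math import gcd

units = [n for n in range(210) if gcd(n, 210) == 1]
print(len(units))   # 48
```

This matches the Euler product phi(210) = 210 * (1/2) * (2/3) * (4/5) * (6/7) = 48.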
Adding the L=11 channel: Z/2310Z. Rissanen redundancy: 936x at token level.
CRT = Chinese Remainder Theorem. All computations run in-browser; no server.