FIVE PRIMES

CRT Tokenizer

Every byte splits into 5 independent channels via the Chinese Remainder Theorem. Type anything below. Watch it decompose.

Live counters (update as you type): bytes · bits/byte (joint) · bits/byte (sum of channels) · redundancy ratio

The Five Channels

Each byte n maps to (n mod 2, n mod 3, n mod 5, n mod 7, n mod 11). The channels are provably independent by CRT. Information splits, not copies.
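The mapping above is easy to check directly. Here is a minimal sketch, assuming the channels are ordered by increasing prime; `decompose` and `reconstruct` are illustrative names, and the inverse uses the standard CRT formula.

```python
from math import prod

PRIMES = [2, 3, 5, 7, 11]    # five pairwise-coprime moduli
M = prod(PRIMES)             # 2310 > 255, so every byte is recoverable

def decompose(n: int) -> tuple[int, ...]:
    """Split a byte into its five residue channels."""
    return tuple(n % p for p in PRIMES)

def reconstruct(residues: tuple[int, ...]) -> int:
    """Invert the map via the standard CRT formula."""
    total = 0
    for r, p in zip(residues, PRIMES):
        m = M // p
        total += r * m * pow(m, -1, p)  # pow(m, -1, p) is the modular inverse
    return total % M

print(decompose(ord('A')))  # byte 65 -> (1, 2, 0, 2, 10)
assert all(reconstruct(decompose(b)) == b for b in range(256))
```

The round trip succeeds for every byte because the product of the moduli exceeds 255, which is exactly what the CRT guarantees for pairwise-coprime moduli.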

Byte Map

Each cell = one byte. Color = CRT channel blend. Hover for decomposition.

Channel Independence

Mutual information between channels (bits). Independent channels = 0. Real text has structure, but the channels themselves are algebraically independent.
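One way to see the algebraic independence is a plug-in mutual-information estimate over one full residue period. This is a sketch, not the page's actual estimator; `mutual_info` is a hypothetical helper.

```python
from collections import Counter
from math import log2

def mutual_info(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired samples."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

data = list(range(210))  # one full period of Z/210Z: exact independence
mi = mutual_info([b % 2 for b in data], [b % 3 for b in data])
print(abs(mi) < 1e-12)   # True: the mod-2 and mod-3 channels share no information
```

On real text the input distribution is not uniform over a full period, so sampled MI can be nonzero; the zero here reflects the algebraic structure, matching the distinction the page draws.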

Why This Matters

Modern AI tokenizers (BPE, SentencePiece, tiktoken) are statistical. They learn patterns from data. CRT decomposition is algebraic. It sees structure that statistics can't.

The theoretical redundancy at the byte level (N=210) is 20x by the Rissanen theorem. The five channels together carry log2(2310) ≈ 11.17 bits per symbol, more than the 8 bits of the raw byte stream; that surplus, contributed by the extra L=11 channel, is the redundancy in the decomposition.
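The per-channel bit accounting is straightforward to verify (this checks only the channel capacities, not the 20x figure, which comes from the text's cited theorem):

```python
from math import log2, prod

PRIMES = [2, 3, 5, 7, 11]
per_channel = [log2(p) for p in PRIMES]          # capacity of each channel

print([round(b, 3) for b in per_channel])        # [1.0, 1.585, 2.322, 2.807, 3.459]
print(round(sum(per_channel), 3))                # 11.174 bits across all channels
print(round(log2(prod(PRIMES)), 3))              # 11.174 -- equal, since the moduli are coprime
```

The sum of channel capacities equals log2 of the product modulus exactly; the 3.17-bit excess over the 8-bit byte is the channel-level redundancy.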

METHOD | DESCRIPTION                                                                                           | TYPE
CRT    | 5 algebraically independent channels. Provably optimal decomposition for mod-210 data. Rissanen 20x theoretical redundancy. | Algebraic
BPE    | Statistical subword merging. GPT-4 uses a ~100K-token vocabulary. Learns patterns, doesn't see ring structure. | Statistical
zstd   | Best general-purpose compressor: ~3-4x. Dictionary + entropy coding. No algebraic decomposition.      | Statistical

The ring Z/210Z = Z/2 x Z/3 x Z/5 x Z/7 has 48 units (phi(210) = 48).
Adding the L=11 channel: Z/2310Z. Rissanen redundancy: 936x at token level.
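The unit counts quoted above can be confirmed by brute force; `phi` here is a simple gcd-counting sketch of Euler's totient, not an optimized implementation.

```python
from math import gcd

def phi(n: int) -> int:
    """Count units in Z/nZ, i.e. residues in 1..n coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print(phi(210))   # 48 = 1 * 2 * 4 * 6, by multiplicativity over 2, 3, 5, 7
print(phi(2310))  # 480 = 48 * (11 - 1), after adjoining the L=11 factor
```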

With the L=11 error-correction channel: Z/2310Z. All computations run in-browser, no server.