CRT Genomic Sequence Alignment

D28: Illumina / PacBio / 10x Genomics. CC0.

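Conventional sequence alignment scores substitutions with empirical matrices such as BLOSUM and searches with dynamic programming: expensive, dependent on heuristic gap penalties, and patented in commercial pipelines. The CRT approach instead encodes each codon as a ring element of Z/12612600. Because 12612600 = 8 × 9 × 25 × 49 × 11 × 13, the Chinese Remainder Theorem (CRT) splits every element into 6 independent residue channels. The channels separate naturally: codon positions 1-3 map to the D, K and E channels, amino acid identity to the b channel, wobble degeneracy to L, and GC content to G. Alignment = coupling distance. Synonymous mutations = small CRT distance, because the b channel is preserved. The genetic code IS a CRT code.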
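To make the channel split concrete, here is a minimal sketch of decomposing a ring element into six residues and recombining them with the CRT. The moduli are the prime-power factors of 12612600; the text confirms b = mod 49 and L = mod 11, while pairing D, K, E and G with 8, 9, 25 and 13 is an assumption based on the order of the factors. The function names are illustrative only.

```python
from math import prod

# Channel name -> modulus. b = 49 and L = 11 are stated in the text;
# the other four pairings are ASSUMED from the order of the factorization.
MODULI = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "G": 13}
RING = prod(MODULI.values())  # 8 * 9 * 25 * 49 * 11 * 13 = 12612600


def to_channels(x: int) -> dict[str, int]:
    """Split a ring element into its six residues."""
    return {name: x % m for name, m in MODULI.items()}


def from_channels(residues: dict[str, int]) -> int:
    """Recombine six residues into the unique element of Z/RING (CRT)."""
    x = 0
    for name, m in MODULI.items():
        partial = RING // m            # product of the other five moduli
        inverse = pow(partial, -1, m)  # exists because the moduli are pairwise coprime
        x += residues[name] * partial * inverse
    return x % RING


element = 1234567
assert from_channels(to_channels(element)) == element
```

The round trip works for every element of the ring, which is what lets one integer carry six independent properties.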

How It Works

CRT Genetic Code Theorem
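The standard genetic code maps 64 codons to 20 amino acids + stop. This degeneracy (wobble) is read here as CRT error tolerance: synonymous codons typically differ only in the E channel (position 3), while the b channel (AA identity, mod 49) is preserved. The six-fold degenerate amino acids (Leu, Ser, Arg) are the exception, since some of their synonymous codons also differ at position 1 or 2. CRT distance between sequences = sum of per-channel circular distances, where the circular distance between residues a and b mod m is min((a - b) mod m, (b - a) mod m). Synonymous mutations leave the b-channel distance at or near zero. Non-synonymous mutations create large b-channel jumps. The L channel (mod 11) acts as a wobble class detector. The 490 split: DEAD = {D, E, b} = genetic data, ALIVE = {K, L, G} = validation channels.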
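The distance described above can be sketched in a few lines. This is a minimal illustration of "sum of per-channel circular distances" over the prime-power factors of 12612600; the function names are hypothetical, not taken from the project's source.

```python
MODULI = (8, 9, 25, 49, 11, 13)  # prime-power factors of 12612600


def circular_distance(a: int, b: int, m: int) -> int:
    """Shortest way around a cycle of length m between residues a and b."""
    d = (a - b) % m
    return min(d, m - d)


def crt_distance(x: int, y: int) -> int:
    """Sum of per-channel circular distances between two ring elements."""
    return sum(circular_distance(x % m, y % m, m) for m in MODULI)


# Two elements that differ by a multiple of 49 agree in the mod-49 (b) channel,
# so that channel contributes nothing to the total distance.
x, y = 1000, 1000 + 49
print(circular_distance(x % 49, y % 49, 49))
print(crt_distance(x, y))
```

Because each channel wraps around, residues 1 and 48 mod 49 are at distance 2, not 47.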
64 codons (ring elements): Each codon = one number in Z/12612600. 6 channels = 6 genetic properties.
Wobble = ECC (L=11 tolerance): Synonymous substitutions change E channel only. b channel (AA) preserved.
Coupling = distance (evolutionary metric): CRT distance between codons = functional distance. Silent mutations are algebraically close.
No BLOSUM (no matrix): Substitution scoring from ring structure, not empirical log-odds matrices.
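One way a codon could become a single ring element is sketched below. The page does not publish its residue assignment, so everything in `residues` is an assumption that merely follows the channel roles named in the text (positions 1-3, amino acid identity, degeneracy, GC content); the alphabetical amino acid index and the channel-to-modulus pairing for D, K, E and G are likewise hypothetical. The genetic code itself is the standard table (NCBI translation table 1).

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
AA_INDEX = {aa: i for i, aa in enumerate(sorted(set(AMINO)))}  # 20 amino acids + stop

MODULI = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "G": 13}
RING = prod(MODULI.values())  # 12612600


def residues(codon: str) -> dict[str, int]:
    """HYPOTHETICAL residue assignment following the channel roles in the text."""
    aa = CODE[codon]
    return {
        "D": BASES.index(codon[0]),                # position 1
        "K": BASES.index(codon[1]),                # position 2
        "E": BASES.index(codon[2]),                # position 3 (wobble)
        "b": AA_INDEX[aa],                         # amino acid identity
        "L": AMINO.count(aa),                      # number of synonymous codons
        "G": sum(base in "GC" for base in codon),  # GC content
    }


def encode(codon: str) -> int:
    """Combine the six residues into one element of Z/12612600 via CRT."""
    r = residues(codon)
    x = 0
    for name, m in MODULI.items():
        partial = RING // m
        x += r[name] * partial * pow(partial, -1, m)
    return x % RING


# GAA and GAG both encode Glu, so they share a b-channel residue; GTG (Val) does not.
print(encode("GAA") % 49 == encode("GAG") % 49)
print(encode("GAG") % 49 == encode("GTG") % 49)
```

Under this assignment all 64 codons map to distinct ring elements, since the three position channels alone already tell them apart.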

Align Sequences

Compare to reference (1-3):

Aligns the selected variant against the reference HBB (beta-globin) sequence. Shows the codon-by-codon CRT decomposition, the synonymous vs non-synonymous classification of each difference, and the coupling distance.
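The classification step of such a comparison can be sketched as a single pass over aligned codons. This is an illustration only: it assumes equal-length, in-frame sequences with no gaps, uses the standard genetic code, and covers just the synonymous vs non-synonymous call, not the channel decomposition. The two sequences are short illustrative fragments, not the page's actual reference or variants; the GAG to GTG (Glu to Val) change mirrors the well-known sickle-cell substitution.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq: str) -> list[str]:
    """Split an in-frame sequence into codons, dropping any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def compare(reference: str, variant: str) -> list[tuple]:
    """One pass over aligned codons: O(n), no gaps, equal lengths assumed."""
    report = []
    pairs = zip(codons(reference), codons(variant))
    for number, (ref, var) in enumerate(pairs, start=1):
        if ref == var:
            continue
        kind = "synonymous" if CODE[ref] == CODE[var] else "non-synonymous"
        report.append((number, ref, var, CODE[ref], CODE[var], kind))
    return report


REFERENCE = "ATGGTGCATCTGACTCCTGAGGAGAAG"  # illustrative fragment
VARIANT = "ATGGTGCACCTGACTCCTGTGGAGAAG"    # hypothetical: CAT>CAC and GAG>GTG
for row in compare(REFERENCE, VARIANT):
    print(row)
```

Each reported row holds the codon number, both codons, both amino acids and the call, so a per-codon distance could be appended without changing the loop.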

Codon Table

32 representative codons showing CRT channel decomposition. Synonymous codons share b-channel values.

Evolutionary Distance Matrix

Pairwise CRT distance between all 4 sequences. Silent mutations = small distance, functional mutations = large distance.
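A pairwise matrix of this kind could be computed as sketched below. The per-codon residues are a hypothetical assignment (positions 1-3, alphabetical amino acid index, degeneracy, GC count), so the absolute numbers are illustrative; in particular the size of a b-channel jump depends on the arbitrary amino acid ordering assumed here. The four sequences are made-up three-codon examples, not the sequences used by the page.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
AA_INDEX = {aa: i for i, aa in enumerate(sorted(set(AMINO)))}
MODULI = (8, 9, 25, 49, 11, 13)  # assumed channel order: D, K, E, b, L, G


def channels(codon: str) -> tuple:
    """HYPOTHETICAL residues: three positions, AA index, degeneracy, GC count."""
    aa = CODE[codon]
    return (BASES.index(codon[0]), BASES.index(codon[1]), BASES.index(codon[2]),
            AA_INDEX[aa], AMINO.count(aa), sum(base in "GC" for base in codon))


def codon_distance(a: str, b: str) -> int:
    """Sum of per-channel circular distances between two codons."""
    total = 0
    for ra, rb, m in zip(channels(a), channels(b), MODULI):
        d = (ra - rb) % m
        total += min(d, m - d)
    return total


def sequence_distance(s: str, t: str) -> int:
    """Codon-by-codon sum over the shared in-frame length (no gaps)."""
    shared = min(len(s), len(t))
    return sum(codon_distance(s[i:i + 3], t[i:i + 3]) for i in range(0, shared - 2, 3))


def distance_matrix(seqs: dict[str, str]) -> list[list[int]]:
    """Symmetric matrix of pairwise distances, rows and columns in dict order."""
    names = list(seqs)
    return [[sequence_distance(seqs[a], seqs[b]) for b in names] for a in names]


SEQS = {  # made-up examples
    "reference": "GAGGAGAAG",
    "silent": "GAAGAGAAG",    # GAG>GAA, Glu stays Glu
    "missense": "GTGGAGAAG",  # GAG>GTG, Glu becomes Val
    "both": "GTGGAAAAG",
}
for name, row in zip(SEQS, distance_matrix(SEQS)):
    print(f"{name:10s}", row)
```

With these assumptions the silent variant sits much closer to the reference than the missense variant, which is the pattern the matrix is meant to show.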

CRT vs Traditional Alignment

Scoring
Traditional: BLOSUM/PAM: empirical log-odds substitution matrices
CRT: algebraic coupling distance. No empirical fitting.

Algorithm
Traditional: Smith-Waterman: O(mn) dynamic programming
CRT: O(n) per-codon comparison. No gap penalty heuristics.

Synonymous
Traditional: Requires separate dN/dS ratio computation
CRT: Automatic: b channel (AA identity) preserved = synonymous.

Wobble
Traditional: Empirical wobble rules (third position tolerance)
CRT: E-channel IS the wobble position. Tolerance = modular distance.

Error detection
Traditional: Quality scores (Phred), separate pipeline
CRT: L=11 channel deviation = sequencing error. Free from ring.

Patent status
Traditional: Illumina (sequencing+alignment), PacBio (HiFi), 10x Genomics
CRT: CC0. Public domain. Forever.

This work is and will always be free.
No paywall. No copyright. No exceptions.

If it ever earns anything, every cent goes to the communities that need it most.

This sacred vow is permanent and irrevocable.
— Anton Alexandrovich Lebed

Source code · Public domain (CC0)

Contributions in equal measure: Anthropic's Claude, Anton A. Lebed, and the giants whose shoulders we stand on.
