CRT Speech Codec

Fraunhofer / Opus / AAC / EVS. CC0.

Speech and audio codecs such as AAC, Opus, and EVS use the Modified Discrete Cosine Transform (MDCT) to decompose audio into frequency bands, then quantize per band. Patented areas: psychoacoustic models, bit allocation, entropy coding, packet loss concealment. CRT approach: encode speech spectral features (sub-bass, bass, midrange, presence, brilliance, air, ultrasonic) as ring elements in Z/214,414,200, where 214,414,200 = 8 × 9 × 25 × 49 × 11 × 13 × 17. The 7 pairwise coprime factors give 7 CRT channels = 7 algebraically independent spectral bands. Quantize each channel independently. mod-11 = error concealment for free. No MDCT. No psychoacoustic model. The ring structure IS the transform.

How It Works

CRT Speech Codec Theorem
Speech spectral frames encoded in Z/214,414,200 decompose into 7 independent CRT channels:
mod 8 = sub-bass (20-60 Hz, room tone)
mod 9 = bass (60-250 Hz, voice fundamental)
mod 25 = midrange (250 Hz-2 kHz, vowel formants)
mod 49 = presence (2-6 kHz, consonant detail)
mod 11 = brilliance (6-12 kHz, sibilance)
mod 13 = air (12-20 kHz, breathiness)
mod 17 = ultrasonic (20 kHz+, inaudible harmonics)
Per-channel quantization: each channel quantized independently to floor(bits/quant_level) bits. Reconstruction via CRT. Quality degrades gracefully per channel.
mod-11 error concealment: lost frames reconstructed by neighbor interpolation within the mod-11 residue class. Small modulus = high interpolation accuracy.
3+4 data/parity split: data channels {mod 8, mod 25, mod 49} = speech CONTENT (formants, pitch). Parity channels {mod 9, mod 11, mod 13, mod 17} = speech IDENTITY (speaker, prosody). Low-bitrate voice recognition: quantize data aggressively, preserve parity.
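The decomposition and reconstruction in the theorem can be sketched in a few lines of Python. This is a minimal illustration, not the page's .ax source: the function names `decompose` and `reconstruct` are hypothetical, and the sample frame value is arbitrary.

```python
from math import prod

# Moduli in the order the theorem lists the bands:
# sub-bass, bass, midrange, presence, brilliance, air, ultrasonic.
MODULI = (8, 9, 25, 49, 11, 13, 17)
RING = prod(MODULI)  # 214,414,200


def decompose(x):
    """Split a ring element into its 7 channel residues."""
    return tuple(x % m for m in MODULI)


def reconstruct(residues):
    """Rebuild the ring element from its residues (Chinese Remainder Theorem)."""
    x = 0
    for r, m in zip(residues, MODULI):
        n = RING // m                 # product of the other six moduli
        x += r * n * pow(n, -1, m)    # pow(n, -1, m) is the inverse of n mod m
    return x % RING


frame = 123_456_789                   # arbitrary example value below RING
channels = decompose(frame)
assert reconstruct(channels) == frame
```

The round trip works because the seven moduli are pairwise coprime, so every 7-tuple of residues corresponds to exactly one element of Z/214,414,200. The three-argument `pow` with exponent -1 needs Python 3.8 or later.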
7 spectral bands (CRT channels): Each channel captures one frequency range independently. No cross-band leakage.
Per-channel quantize (Independent): Reduce bits in each channel separately. Graceful degradation. No global bit allocation.
mod-11 concealment (Free ECC): Lost packets recovered by neighbor interpolation. Small-modulus channels have highest recovery rate.
3+4 split (Content vs identity): Data channels = formants (what is said). Parity channels = speaker (who says it). Selective compression.
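Per-channel quantization and the 3+4 split might look like the sketch below. The source gives only the rule floor(bits/quant_level); everything else here is an assumption made for illustration: the bit width of a channel is taken as the number of bits needed to hold m-1, quantization is done by clearing low-order bits, and the names `channel_bits`, `quantize`, and `selective_quantize` are hypothetical.

```python
MODULI = (8, 9, 25, 49, 11, 13, 17)
DATA = {8, 25, 49}         # speech content, per the 3+4 split
PARITY = {9, 11, 13, 17}   # speech identity, per the 3+4 split


def channel_bits(m):
    """Assumed bit width of a channel: bits needed to store residues 0..m-1."""
    return (m - 1).bit_length()


def quantize(residue, m, q):
    """Keep floor(bits / q) high-order bits of the residue and zero the rest."""
    bits = channel_bits(m)
    shift = bits - bits // q
    return (residue >> shift) << shift


def selective_quantize(residues, data_q=4, parity_q=1):
    """Quantize data channels aggressively and leave parity channels intact."""
    return tuple(
        quantize(r, m, data_q if m in DATA else parity_q)
        for r, m in zip(residues, MODULI)
    )


coarse = selective_quantize((7, 8, 24, 48, 10, 12, 16))
```

Clearing low-order bits can only lower a residue, so each quantized value is still a valid residue for its modulus and can go straight into CRT reconstruction.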

Codec Analysis

The Play Spectral Bands control synthesizes 7 oscillators, one at each CRT band's center frequency, with gain derived from that channel's residue. The test signal is a 40-frame speech utterance (800 ms at 20 ms/frame), compared at three quantization levels.
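A rough sketch of the 7-oscillator playback for one frame follows. The page does not say how center frequencies or gains are computed, so several choices here are assumptions: the center frequency is the geometric mean of the band edges, the ultrasonic band is capped at 24 kHz, gain is residue / (m - 1), and the sample rate is 48 kHz. The name `synth_frame` is hypothetical.

```python
import math

# (modulus, low edge Hz, high edge Hz). The 24 kHz upper edge of the
# ultrasonic band is an assumption; the source only says "20kHz+".
BANDS = (
    (8, 20, 60), (9, 60, 250), (25, 250, 2000), (49, 2000, 6000),
    (11, 6000, 12000), (13, 12000, 20000), (17, 20000, 24000),
)
SAMPLE_RATE = 48_000   # assumed
FRAME_MS = 20          # 20 ms per frame, as stated in the text


def synth_frame(residues):
    """Sum 7 sine oscillators, one per band, for a single 20 ms frame."""
    n = SAMPLE_RATE * FRAME_MS // 1000
    samples = [0.0] * n
    for r, (m, lo, hi) in zip(residues, BANDS):
        freq = math.sqrt(lo * hi)      # assumed center: geometric mean
        gain = r / (m - 1)             # assumed mapping of residue to 0..1
        for t in range(n):
            samples[t] += gain * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
    # Divide by the band count so the mix stays within [-1, 1].
    return [s / len(BANDS) for s in samples]


frame = synth_frame((7, 8, 24, 48, 10, 12, 16))
```

At these settings each frame is 960 samples, so the 40-frame utterance would be 38,400 samples.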

Packet Loss Concealment

25% packet loss: every 4th frame dropped. Lost frames concealed by per-channel neighbor interpolation. Small-modulus channels recover best (closest interpolation). No explicit FEC overhead.
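The concealment scheme above can be sketched as follows. The interpolation rule is an assumption: each lost residue is replaced by the integer midpoint of the nearest received neighbors in the same channel, treating residues as plain integers and ignoring wraparound. The names `drop_every_fourth` and `conceal` are hypothetical.

```python
def drop_every_fourth(frames):
    """Simulate 25% loss: mark every 4th frame (index 3, 7, ...) as lost."""
    return [None if i % 4 == 3 else f for i, f in enumerate(frames)]


def nearest_received(frames, start, step):
    """Walk from `start` in direction `step` until a received frame is found."""
    i = start
    while 0 <= i < len(frames):
        if frames[i] is not None:
            return frames[i]
        i += step
    return None


def conceal(frames):
    """Fill each lost frame per channel from its nearest received neighbors."""
    out = list(frames)
    for i, f in enumerate(frames):
        if f is not None:
            continue
        prev = nearest_received(frames, i - 1, -1)
        nxt = nearest_received(frames, i + 1, +1)
        if prev is None and nxt is None:
            raise ValueError("no received frames to interpolate from")
        if prev is None:
            prev = nxt      # lost frame at the start: copy the next frame
        if nxt is None:
            nxt = prev      # lost frame at the end: copy the previous frame
        # The midpoint of two valid residues is itself a valid residue.
        out[i] = tuple((a + b) // 2 for a, b in zip(prev, nxt))
    return out


received = [(0, 0, 0, 0, 0, 0, 0), None, (4, 4, 10, 20, 6, 8, 12)]
repaired = conceal(received)
```

No extra bits are sent for this, which is the sense in which the text calls the concealment free of FEC overhead.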

Batch Codec Test

8 speakers x 30 frames each. Distortion measured at Q=1 (lossless), Q=2 (50% bit reduction), Q=4 (75% bit reduction). CRT guarantees across all 7 channels: per-channel quantization error stays within that channel. No cross-band artifacts.
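The claim that quantization error stays within its channel can be checked directly. In this sketch one channel is coarsened, the ring element is rebuilt via CRT, and the result is decomposed again to compare every channel. The bit-dropping quantizer and the names `coarsen` and `channel_errors` are assumptions for illustration; the page does not state how distortion is measured.

```python
from math import prod

MODULI = (8, 9, 25, 49, 11, 13, 17)
RING = prod(MODULI)


def decompose(x):
    return tuple(x % m for m in MODULI)


def reconstruct(residues):
    x = 0
    for r, m in zip(residues, MODULI):
        n = RING // m
        x += r * n * pow(n, -1, m)
    return x % RING


def coarsen(residue, m, q):
    """Assumed quantizer: keep floor(bits / q) high-order bits."""
    bits = (m - 1).bit_length()
    shift = bits - bits // q
    return (residue >> shift) << shift


def channel_errors(x, channel, q):
    """Quantize one channel of x, round-trip through CRT, report per-channel error."""
    before = decompose(x)
    after = list(before)
    after[channel] = coarsen(before[channel], MODULI[channel], q)
    decoded = decompose(reconstruct(after))
    return tuple(abs(a - b) for a, b in zip(before, decoded))


# Coarsen only the presence channel (index 3, mod 49) at Q=4.
errors = channel_errors(200_000_001, channel=3, q=4)
assert all(e == 0 for i, e in enumerate(errors) if i != 3)
```

The six untouched channels come back bit-exact because the CRT map is a bijection between ring elements and residue tuples.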

CRT vs Traditional Speech Codecs

Aspect | Traditional codecs | CRT
Transform | Opus/AAC: MDCT (modified discrete cosine transform, patented variants) | 7 independent channel residues. No transform matrix. Integer arithmetic.
Bit alloc | Psychoacoustic model allocates bits per band (patented) | Per-channel: each mod quantized independently. No model needed.
Packet loss | Opus: SILK/CELT hybrid FEC (complex, patent-adjacent) | mod-11 neighbor interpolation: small modulus = high recovery. Free from algebra.
Bands | Typically 18-64 bands (empirical, tuned) | 7 bands = 7 CRT channels. Algebraically independent. Not tuned.
Compute | FFT/MDCT + entropy coding + rate control | Modular arithmetic. 7 mod operations per frame. Integer only.
Patent status | Fraunhofer (AAC/MP3), various (EVS/3GPP), IETF (Opus is open but complex) | CC0. Public domain. Forever.

Source code · Public domain (CC0)

.ax source compiled to WASM via self-hosting compiler. Zero HTML authored.