CRT Significance Testing

Z/21 = Z/3 x Z/7

The Chinese Remainder Theorem (CRT) splits one question into algebraically independent channels. Because 3 and 7 are coprime, Z/21 = Z/3 x Z/7, which gives two independent test channels. The result is an evidence tuple: per-channel p-values plus an R-squared reconstruction score. The data's index set determines the test design, so there are no parameters to tune and no model to fit.

The Data: 21 Amino Acids in Z/21

The data are the Kyte-Doolittle hydrophobicity values for the 20 amino acids plus Stop (*), indexed 0-20 alphabetically by one-letter code with Stop last. Z/21 = Z/3 x Z/7 gives each index two independent coordinates: its residue mod 3 and its residue mod 7. Groups are equal-sized: 21/3 = 7 members per mod-3 group, 21/7 = 3 members per mod-7 group.

Index  AA  Hydro  mod 3  mod 7
    0  A    +1.8      0      0
    1  C    +2.5      1      1
    2  D    -3.5      2      2
    3  E    -3.5      0      3
    4  F    +2.8      1      4
    5  G    -0.4      2      5
    6  H    -3.2      0      6
    7  I    +4.5      1      0
    8  K    -3.9      2      1
    9  L    +3.8      0      2
   10  M    +1.9      1      3
   11  N    -3.5      2      4
   12  P    -1.6      0      5
   13  Q    -3.5      1      6
   14  R    -4.5      2      0
   15  S    -0.8      0      1
   16  T    -0.7      1      2
   17  V    +4.2      2      3
   18  W    -0.9      0      4
   19  Y    -1.3      1      5
   20  *    +0.0      2      6
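The two coordinate columns can be reproduced in a few lines. This is a minimal sketch, not the page's own code: the names `coords`, `size3`, `size7`, and `crt_index` are illustrative, and the reconstruction constants 7 and 15 are simply the integers below 21 that reduce to (1, 0) and (0, 1) in the two channels.

```python
# CRT coordinates for the 21 indices: index -> (index mod 3, index mod 7).
N, M3, M7 = 21, 3, 7

coords = [(i % M3, i % M7) for i in range(N)]

# Because gcd(3, 7) = 1, the map is a bijection: all 21 pairs are distinct.
assert len(set(coords)) == N

# Group sizes per channel: 7 members per mod-3 group, 3 per mod-7 group.
size3 = [sum(1 for a, _ in coords if a == g) for g in range(M3)]
size7 = [sum(1 for _, b in coords if b == g) for g in range(M7)]


def crt_index(a: int, b: int) -> int:
    """Recover the index in 0..20 from its (mod 3, mod 7) coordinates."""
    # 7 is (1 mod 3, 0 mod 7) and 15 is (0 mod 3, 1 mod 7).
    return (7 * a + 15 * b) % N


print(coords[17])  # (2, 3): V sits in mod-3 group 2 and mod-7 group 3
print(size3, size7)
```

The round trip `crt_index(*coords[i]) == i` holds for every index, which is the sense in which the two channels carry complementary information.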

CRT Channel Decomposition as Test Framework

CRT Test Theorem
Given n data points indexed in Z/N = Z/m1 x Z/m2 x ... x Z/mk with pairwise coprime moduli, each CRT channel groups the indices by residue, and these groupings are independent of one another. The test statistic for a channel is H = the sum of squared group sums; a higher H means more between-group separation. The permutation null shuffles the data values across indices. The per-channel p-value is the fraction of permutations with H >= the observed H. The joint p-value is the fraction of permutations in which ALL channels meet or exceed their observed H simultaneously. CRT guarantees algebraic independence between channels.
No parameters (data chooses ring): 21 amino acids live in Z/21. No embedding parameter. The ring is the test structure.
Two channels (mod 3, mod 7): mod-3 sorts into 3 groups of 7. mod-7 sorts into 7 groups of 3. Independent by CRT.
Joint power (multiplicative): If channels are independent, p_joint approaches p_3 * p_7. Two weak signals combine.
Embedding (open question): Alphabetical index = one possible mapping. Molecular weight, codon order = alternatives to test.
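The statistic H is small enough to compute by hand, but a short sketch makes the definition concrete. The function name `channel_h` is an illustrative choice, not taken from the page's source; the values are the hydrophobicity column from the table.

```python
# Hydrophobicity values in index order 0..20 (A, C, D, ..., Y, Stop).
HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]


def channel_h(values, modulus):
    """H = sum over residue classes of (sum of the values in that class)^2."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)


h3 = channel_h(HYDRO, 3)  # 3 groups of 7
h7 = channel_h(HYDRO, 7)  # 7 groups of 3
print(f"H(mod 3) = {h3:.2f}, H(mod 7) = {h7:.2f}")
```

Because the grand total is fixed under any shuffle, a larger H can only come from group sums that are spread further apart, which is why H works as a between-group separation score here.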

The Evidence Tuple

Evidence Tuple
Given n data points with response values indexed 0..n-1 in ring Z/N = Z/m1 x ... x Z/mk, the evidence tuple is E(data, Z/N) = { p1, ..., pk, R-squared, E[R-squared|null] }. Each p_c is the permutation p-value for channel c. R-squared = reconstruction convergence: predict each value from its CRT channel group means. E[R-squared|null] = 1 - product(1 - (m_c - 1)/(n - 1)) = expected R-squared under random assignment.
Per-channel p (which axis?): A single p-value masks which channels carry the signal. Per-channel p-values reveal the algebraic structure of the effect.
R-squared (how much?): Predict each value from its CRT group means: y_hat = mean(mod-3 group) + mean(mod-7 group) - grand mean. R-squared = 1 - SS_residual / SS_total.
Null baseline (overfitting guard): With k channels of moduli m1..mk, random data gives R-squared around E[R-squared|null] = 1 - product(1 - (m_c - 1)/(n - 1)). For Z/21: 1 - (1 - 2/20)(1 - 6/20) = 1 - 0.9 x 0.7 = 37%. Only the excess above this baseline is real signal.
Joint test (CRT independence): The joint p-value (the fraction of permutations where ALL channels exceed simultaneously) handles multiplicity directly. CRT proves channel independence algebraically, with no assumption needed, so the exact Šidák correction 1 - (1 - alpha)^k applies.
Power curve (built-in control): Non-signal channels stay at the ~5% noise level while the signal channel rises. CRT channel independence gives a free negative control per channel, so no separate control group is needed.
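The reconstruction score and its baseline can be sketched directly from the two formulas above. Function names are illustrative. The `(k - 1) * grand mean` term is an assumed generalisation to k channels; for the two channels of Z/21 it reduces to the stated y_hat. `null_baseline` evaluates the product formula exactly as the text gives it and does not re-derive it.

```python
HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]


def mean(xs):
    return sum(xs) / len(xs)


def group_means(values, modulus):
    """Mean of the values in each residue class of the given modulus."""
    return [mean([v for i, v in enumerate(values) if i % modulus == g])
            for g in range(modulus)]


def r_squared(values, moduli=(3, 7)):
    """R-squared of y_hat = sum of channel group means - (k - 1) * grand mean."""
    grand = mean(values)
    means = {m: group_means(values, m) for m in moduli}
    ss_res = ss_tot = 0.0
    for i, v in enumerate(values):
        y_hat = sum(means[m][i % m] for m in moduli) - (len(moduli) - 1) * grand
        ss_res += (v - y_hat) ** 2
        ss_tot += (v - grand) ** 2
    return 1 - ss_res / ss_tot


def null_baseline(n, moduli):
    """E[R-squared | null] as stated in the text: 1 - prod(1 - (m - 1)/(n - 1))."""
    prod = 1.0
    for m in moduli:
        prod *= 1 - (m - 1) / (n - 1)
    return 1 - prod


print(f"R-squared = {r_squared(HYDRO):.3f}")
print(f"null baseline = {null_baseline(21, (3, 7)):.2f}")
```

Data that is exactly additive in the two channels, such as `2 * (i % 3) + (i % 7)`, reconstructs perfectly, which makes a convenient sanity check for the helper.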

Run the Permutation Test

500 random permutations. For each, shuffle hydrophobicity values across the 21 indices, compute H for mod-3 and mod-7 groupings. Count how many exceed the observed values. Try different seeds to verify stability.
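A minimal sketch of this procedure follows, using Python's standard `random` module. It is not the page's compiled implementation: the function names and the seeding scheme are assumptions, and the p-values are plain exceedance fractions as described above, with no +1 correction.

```python
import random

HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]


def channel_h(values, modulus):
    """Sum of squared group sums for the grouping index mod modulus."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)


def permutation_test(values, moduli=(3, 7), n_perm=500, seed=0):
    """Return ({modulus: p-value}, joint p-value) from n_perm random shuffles."""
    rng = random.Random(seed)
    observed = {m: channel_h(values, m) for m in moduli}
    exceed = {m: 0 for m in moduli}
    joint = 0
    shuffled = list(values)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign values to indices at random
        hits = [channel_h(shuffled, m) >= observed[m] for m in moduli]
        for m, hit in zip(moduli, hits):
            exceed[m] += hit
        joint += all(hits)  # counted only when every channel exceeds
    return {m: exceed[m] / n_perm for m in moduli}, joint / n_perm


for seed in (1, 2, 3):
    p, p_joint = permutation_test(HYDRO, seed=seed)
    print(seed, p, p_joint)
```

Running several seeds, as in the loop above, gives a rough sense of how much the estimated p-values move at 500 permutations.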


Synthetic Validation: Does CRT Find Real Signals?

The acid test: generate data WITH known channel structure, and WITHOUT. CRT should detect the first and miss the second. Three scenarios per run: (1) signal in mod-3 channel only, (2) signal in mod-7 channel only, (3) pure noise. 20 trials each, 50 permutations per trial.
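The three scenarios can be sketched as follows. The page does not state how its synthetic data is generated, so everything about the generator here is an assumption: unit-variance Gaussian noise, a signal that shifts each residue class by a centred multiple of `strength`, and a detection threshold of p <= 0.05.

```python
import random


def channel_h(values, modulus):
    """Sum of squared group sums for the grouping index mod modulus."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)


def p_value(values, modulus, n_perm, rng):
    """Fraction of shuffles whose H meets or exceeds the observed H."""
    observed = channel_h(values, modulus)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += channel_h(shuffled, modulus) >= observed
    return hits / n_perm


def make_data(signal_mod, strength, rng, n=21):
    """Gaussian noise, plus a centred per-group shift in one channel (assumed model)."""
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if signal_mod is not None:
        centre = (signal_mod - 1) / 2
        data = [v + strength * (i % signal_mod - centre)
                for i, v in enumerate(data)]
    return data


def detection_rates(signal_mod, strength=3.0, trials=20, n_perm=50,
                    alpha=0.05, seed=0):
    """Share of trials in which each channel reaches p <= alpha."""
    rng = random.Random(seed)
    detected = {3: 0, 7: 0}
    for _ in range(trials):
        data = make_data(signal_mod, strength, rng)
        for m in detected:
            detected[m] += p_value(data, m, n_perm, rng) <= alpha
    return {m: count / trials for m, count in detected.items()}


for label, mod in (("mod-3 signal", 3), ("mod-7 signal", 7), ("pure noise", None)):
    print(label, detection_rates(mod))
```

With only 50 permutations per trial the smallest nonzero p-value is 0.02, so the detection rates are coarse; they are meant to show the pattern, not to estimate power precisely.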


Power Curve: Detection vs Effect Size

How strong must a signal be before CRT detects it? Sweep signal strength from 0 (noise) to 30 (strong). At each level, inject signal into the mod-3 channel only and run 15 trials. The S-shaped power curve shows the transition from chance (~5%) to reliable detection (~100%). The non-signal channel (mod-7) acts as a built-in negative control.
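A sweep of this kind might look like the sketch below. Several details are assumptions rather than facts about the page: the noise standard deviation (`noise_sd`), the 50 permutations per trial, the p <= 0.05 threshold, and the shape of the injected signal (mod-3 groups shifted by -s, 0, +s).

```python
import random


def channel_h(values, modulus):
    """Sum of squared group sums for the grouping index mod modulus."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)


def detects(values, modulus, n_perm, alpha, rng):
    """True when the permutation p-value for this channel is <= alpha."""
    observed = channel_h(values, modulus)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += channel_h(shuffled, modulus) >= observed
    return hits / n_perm <= alpha


def power_curve(strengths, trials=15, n_perm=50, alpha=0.05,
                noise_sd=10.0, seed=0):
    """Return (strength, mod-3 detection rate, mod-7 detection rate) rows."""
    rng = random.Random(seed)
    curve = []
    for s in strengths:
        found = {3: 0, 7: 0}
        for _ in range(trials):
            # Signal lives in the mod-3 channel only; mod-7 is the control.
            data = [rng.gauss(0.0, noise_sd) + s * (i % 3 - 1)
                    for i in range(21)]
            for m in found:
                found[m] += detects(data, m, n_perm, alpha, rng)
        curve.append((s, found[3] / trials, found[7] / trials))
    return curve


for s, rate3, rate7 in power_curve(range(0, 31, 5)):
    print(f"strength {s:2d}: mod-3 {rate3:.2f}  mod-7 {rate7:.2f}")
```

The third column is the negative control: it reports how often the channel with no injected signal crosses the threshold at each strength.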


Your Data: Test Any 21 Values

Enter 21 comma-separated integers. The CRT test decomposes your data into Z/3 x Z/7 channels and runs the same permutation analysis. The default values are the amino acid hydrophobicity data above.
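Preparing custom input for the same analysis takes two steps: parse the text and split the values by residue class. The helper names below are hypothetical and the validation rules (exactly 21 integers, empty tokens ignored) are an assumption about reasonable behaviour, not a description of the page's input handling.

```python
def parse_values(text: str, n: int = 21) -> list[int]:
    """Parse a comma-separated string into exactly n integers."""
    tokens = [t.strip() for t in text.split(",")]
    values = [int(t) for t in tokens if t]  # int() raises ValueError on junk
    if len(values) != n:
        raise ValueError(f"expected {n} values, got {len(values)}")
    return values


def channel_groups(values, modulus):
    """Split values into residue classes of their index."""
    groups = [[] for _ in range(modulus)]
    for i, v in enumerate(values):
        groups[i % modulus].append(v)
    return groups


values = parse_values("18, 25, -35, -35, 28, -4, -32, 45, -39, 38, 19, "
                      "-35, -16, -35, -45, -8, -7, 42, -9, -13, 0")
print(channel_groups(values, 3))
print(channel_groups(values, 7))
```

The example string is the hydrophobicity column multiplied by 10 so that it fits an integer-only input; that scaling is chosen for illustration only.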


What This Tests

The traditional approach tests one correlation (eigenvalue vs hydrophobicity, r=0.409, p=0.073 after investigation). The CRT approach enriches this with a structured evidence tuple: per-channel p-values reveal WHICH algebraic axis carries the signal, and R-squared measures how much the decomposition explains.

The evidence tuple provides structured information beyond a single p-value. Per-channel p-values show which channels detect the signal. R-squared above the null baseline (37% for Z/21) shows genuine explanatory power beyond what grouping alone provides. The joint p-value handles multiplicity directly: it measures the fraction of permutations where ALL channels exceed simultaneously.

Caveat: the amino acid indexing is alphabetical by one-letter code. This is one possible ring embedding. Other orderings (molecular weight, codon position) give different channel assignments. The method works for any embedding; the open question is which one reveals the most structure.

The power curve shows CRT's minimum detectable effect size empirically. Channel independence means each non-signal channel is a free negative control, so no separate control group is needed. This structural advantage grows with the number of CRT channels.

Source code · Public domain (CC0)


.ax source compiled to WASM via self-hosting compiler. Zero HTML authored.