CRT Significance Testing

Z/21 = Z/3 x Z/7

The Chinese Remainder Theorem (CRT) splits one question into algebraically independent channels. Because 3 and 7 are coprime, Z/21 = Z/3 x Z/7, which gives two independent test channels. The evidence tuple consists of per-channel p-values plus an R-squared reconstruction score. The data's index set determines the test design. No parameters. No model fitting.

The Data: 21 Amino Acids in Z/21

The data are the Kyte-Doolittle hydrophobicity values for the 20 amino acids plus Stop (*, assigned 0.0), indexed 0-20 alphabetically by one-letter code. Z/21 = Z/3 x Z/7 gives each index two independent coordinates: a mod-3 channel and a mod-7 channel. Groups are equal-sized: 21/3 = 7 per mod-3 group, 21/7 = 3 per mod-7 group.

Index	AA	Hydro	mod 3	mod 7
0	A	+1.8	0	0
1	C	+2.5	1	1
2	D	-3.5	2	2
3	E	-3.5	0	3
4	F	+2.8	1	4
5	G	-0.4	2	5
6	H	-3.2	0	6
7	I	+4.5	1	0
8	K	-3.9	2	1
9	L	+3.8	0	2
10	M	+1.9	1	3
11	N	-3.5	2	4
12	P	-1.6	0	5
13	Q	-3.5	1	6
14	R	-4.5	2	0
15	S	-0.8	0	1
16	T	-0.7	1	2
17	V	+4.2	2	3
18	W	-0.9	0	4
19	Y	-1.3	1	5
20	*	+0.0	2	6
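The two coordinate columns can be regenerated and checked in a few lines. This is a minimal sketch; the names AMINO and coords are illustrative and not part of the original page.

```python
from collections import Counter

# One-letter codes in alphabetical order; '*' stands for Stop (index 20).
AMINO = "ACDEFGHIKLMNPQRSTVWY*"

# Each index i in Z/21 maps to the pair (i mod 3, i mod 7).
coords = [(i % 3, i % 7) for i in range(21)]

# CRT: the map is a bijection, so all 21 pairs are distinct.
assert len(set(coords)) == 21

# Group sizes per channel: 7 per mod-3 group, 3 per mod-7 group.
mod3_sizes = Counter(a for a, _ in coords)
mod7_sizes = Counter(b for _, b in coords)
print(dict(mod3_sizes))
print(dict(mod7_sizes))

# Look up one row of the table: S sits at index 15 with coordinates (0, 1).
print(AMINO[15], coords[15])
```

Because the map is a bijection, every mod-3 group meets every mod-7 group in exactly one index, which is what makes the two groupings a balanced, crossed design.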

CRT Channel Decomposition as Test Framework

CRT Test Theorem

Given n data points indexed by Z/N = Z/m1 x Z/m2 x ... x Z/mk with pairwise coprime moduli, each CRT channel c groups the indices by their residue mod m_c. Test statistic: H = sum of squared group sums (higher = more between-group separation). Permutation null: shuffle the data values across indices. Per-channel p-value: fraction of permutations with H >= the observed H. Joint p-value: fraction of permutations in which ALL channels reach or exceed their observed H simultaneously. CRT guarantees algebraic independence between the channel groupings.
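The statistic H is simple enough to compute by hand. A minimal sketch follows, assuming the data are held in a plain list ordered by index; h_statistic and HYDRO are illustrative names, and the values are copied from the table above.

```python
def h_statistic(values, modulus):
    """Sum of squared group sums, grouping index i by i % modulus."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)

# Kyte-Doolittle values in alphabetical index order, Stop = 0.0.
HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]

h3 = h_statistic(HYDRO, 3)  # three groups of seven
h7 = h_statistic(HYDRO, 7)  # seven groups of three
print(round(h3, 2), round(h7, 2))
```

Shuffling never changes the grand total, and the groups within a channel are equal-sized, so across permutations H is a monotone function of the between-group sum of squares. Ranking permutations by H is therefore the same as ranking them by between-group separation.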

No parameters

Data chooses ring

21 amino acids live in Z/21. No embedding parameter. The ring is the test structure.

Two channels

mod 3, mod 7

mod-3 sorts into 3 groups of 7. mod-7 sorts into 7 groups of 3. Independent by CRT.

Joint power

Multiplicative

If channels are independent, p_joint approaches p_3 * p_7. Two weak signals combine: p_3 = 0.2 and p_7 = 0.2 would give p_joint near 0.04.

Embedding

Open question

Alphabetical index = one possible mapping. Molecular weight, codon order = alternatives to test.

The Evidence Tuple

Evidence Tuple

Given n data points with response values indexed 0..n-1 in the ring Z/N = Z/m1 x ... x Z/mk, the evidence tuple is E(data, Z/N) = { p1, ..., pk, R-squared, E[R-squared|null] }. Each p_c is the permutation p-value for channel c. R-squared measures reconstruction quality: the fraction of variance recovered when each value is predicted from its CRT channel group means. E[R-squared|null] = 1 - product(1 - (m_c - 1)/(n - 1)) is the expected R-squared under random assignment.

Per-channel p

Which axis?

A single p-value masks which channels carry the signal. Per-channel p-values reveal the algebraic structure of the effect.

R-squared

How much?

Predict each value from CRT group means: y_hat = mean(mod-3 group) + mean(mod-7 group) - grand mean. R-squared = 1 - SS_residual / SS_total.
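The reconstruction formula above can be sketched directly for the two-channel case. The function names here are hypothetical and the code assumes values are ordered by index; it is an illustration of the stated formula, not the page's own implementation.

```python
def group_means(values, modulus):
    """Mean of the values in each residue class i % modulus."""
    sums = [0.0] * modulus
    counts = [0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
        counts[i % modulus] += 1
    return [s / c for s, c in zip(sums, counts)]

def crt_r_squared(values, m_a=3, m_b=7):
    """R-squared of the prediction mean_a + mean_b - grand mean."""
    grand = sum(values) / len(values)
    means_a = group_means(values, m_a)
    means_b = group_means(values, m_b)
    ss_res = 0.0
    ss_tot = 0.0
    for i, v in enumerate(values):
        y_hat = means_a[i % m_a] + means_b[i % m_b] - grand
        ss_res += (v - y_hat) ** 2
        ss_tot += (v - grand) ** 2
    return 1.0 - ss_res / ss_tot

# Kyte-Doolittle values in alphabetical index order, Stop = 0.0.
HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]

r2 = crt_r_squared(HYDRO)
print(f"R-squared = {r2:.3f}")
```

A quick sanity check is data that depends only on the mod-3 coordinate, for example values = [i % 3 for i in range(21)]. Every mod-7 group then has the grand mean, the prediction is exact, and R-squared is 1.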

Null baseline

Overfitting guard

With k channels of moduli m1..mk, random data gives R-squared around E[R-squared] = 1 - product(1 - (m_c-1)/(n-1)). For Z/21: 37%. Only excess above this is real signal.
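The 37% figure follows directly from the formula. A short sketch of the arithmetic, with null_r_squared as an illustrative name:

```python
from math import prod

def null_r_squared(n, moduli):
    """Baseline from the text: 1 - product(1 - (m_c - 1)/(n - 1))."""
    return 1.0 - prod(1.0 - (m - 1) / (n - 1) for m in moduli)

# Z/21 = Z/3 x Z/7: 1 - (1 - 2/20) * (1 - 6/20) = 1 - 0.9 * 0.7 = 0.37
baseline = null_r_squared(21, (3, 7))
print(f"{baseline:.2f}")
```

An observed R-squared would be compared against this baseline rather than against zero.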

Joint test

CRT independence

The joint p-value (fraction of permutations in which ALL channels reach or exceed their observed H simultaneously) handles multiplicity directly. CRT proves channel independence algebraically -- no assumption needed -- so the exact family-wise error rate for k independent tests, 1-(1-alpha)^k (the Šidák relation), applies.
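To make the formula 1-(1-alpha)^k concrete, here is a small sketch. The value alpha = 0.05 is an assumption for illustration, and both function names are hypothetical.

```python
def familywise_rate(alpha, k):
    """Chance of at least one false positive across k independent tests."""
    return 1.0 - (1.0 - alpha) ** k

def per_channel_alpha(target, k):
    """Per-test level that keeps the family-wise rate at `target`."""
    return 1.0 - (1.0 - target) ** (1.0 / k)

# Two channels (mod 3, mod 7), each tested at an assumed alpha = 0.05.
fwer = familywise_rate(0.05, 2)        # 1 - 0.95**2 = 0.0975
adjusted = per_channel_alpha(0.05, 2)  # slightly above 0.05 / 2
print(f"{fwer:.4f} {adjusted:.5f}")
```

With two channels the adjusted level is only marginally looser than the Bonferroni value 0.025; the gap widens as the number of channels grows.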

Power curve

Built-in control

Non-signal channels stay at a detection rate of about 5% (the false-positive rate expected from noise at alpha = 0.05) while the signal channel rises. CRT channel independence gives a free negative control per channel -- no separate control group needed.

Run the Permutation Test

The test runs 500 random permutations. For each, shuffle the hydrophobicity values across the 21 indices and compute H for the mod-3 and mod-7 groupings. Count how many permutations reach or exceed the observed values. Try different seeds to verify stability.
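The procedure can be reproduced offline with the standard library. This is a sketch under stated assumptions: permutation_test and its signature are illustrative, the seeds are arbitrary, and a different random number generator than the page's will give slightly different p-values.

```python
import random

HYDRO = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8, 1.9,
         -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3, 0.0]

def h_statistic(values, modulus):
    """Sum of squared group sums, grouping index i by i % modulus."""
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)

def permutation_test(values, moduli=(3, 7), n_perm=500, seed=0):
    """Return (observed H per channel, p-value per channel, joint p-value)."""
    rng = random.Random(seed)
    observed = {m: h_statistic(values, m) for m in moduli}
    hits = {m: 0 for m in moduli}
    joint_hits = 0
    shuffled = list(values)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign values to indices at random
        exceeded = [h_statistic(shuffled, m) >= observed[m] for m in moduli]
        for m, e in zip(moduli, exceeded):
            hits[m] += int(e)
        joint_hits += int(all(exceeded))  # all channels at once
    p = {m: hits[m] / n_perm for m in moduli}
    return observed, p, joint_hits / n_perm

# Repeat with a few seeds to see how stable the estimates are.
for seed in (0, 1, 2):
    observed, p, p_joint = permutation_test(HYDRO, seed=seed)
    print(seed, p, p_joint)
```

With 500 permutations each p-value is resolved to steps of 0.002, so seed-to-seed differences of a few hundredths are expected sampling noise rather than instability in the method.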

Synthetic Validation: Does CRT Find Real Signals?

The acid test: generate data WITH known channel structure, and WITHOUT. CRT should detect the first and miss the second. Three scenarios per run: (1) signal in mod-3 channel only, (2) signal in mod-7 channel only, (3) pure noise. 20 trials each, 50 permutations per trial.
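The three scenarios can be sketched as follows. The source does not specify how the synthetic data are generated, so the Gaussian noise, the linear group offsets, the strength of 3.0, and the alpha = 0.05 detection threshold are all assumptions made for illustration, as are the function names.

```python
import random

def h_statistic(values, modulus):
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)

def channel_p(values, modulus, n_perm, rng):
    """Permutation p-value for one channel."""
    observed = h_statistic(values, modulus)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += int(h_statistic(shuffled, modulus) >= observed)
    return hits / n_perm

def make_data(signal_modulus, strength, rng, n=21):
    """Unit Gaussian noise, plus a centred offset per group if requested."""
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if signal_modulus is not None:
        centre = (signal_modulus - 1) / 2
        data = [v + strength * ((i % signal_modulus) - centre)
                for i, v in enumerate(data)]
    return data

def detection_rates(signal_modulus, strength=3.0, trials=20,
                    n_perm=50, alpha=0.05, seed=0):
    """Fraction of trials in which each channel reports p < alpha."""
    rng = random.Random(seed)
    found = {3: 0, 7: 0}
    for _ in range(trials):
        data = make_data(signal_modulus, strength, rng)
        for m in (3, 7):
            found[m] += int(channel_p(data, m, n_perm, rng) < alpha)
    return {m: found[m] / trials for m in (3, 7)}

results = {}
for label, modulus in [("mod-3 signal", 3), ("mod-7 signal", 7),
                       ("pure noise", None)]:
    results[label] = detection_rates(modulus)
    print(label, results[label])
```

With only 50 permutations per trial, p-values come in steps of 0.02, so the threshold p < 0.05 corresponds to at most two permutations reaching the observed H.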

Power Curve: Detection vs Effect Size

How strong must a signal be before CRT detects it? Sweep signal strength from 0 (noise) to 30 (strong). At each level, inject signal into the mod-3 channel only and run 15 trials. The S-shaped power curve shows the transition from chance (~5%) to reliable detection (~100%). The non-signal channel (mod-7) acts as a built-in negative control.
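A sweep of this kind might look like the sketch below. The noise scale (standard deviation 10), the step size of 5, the 50 permutations per trial, and the alpha = 0.05 threshold are assumptions; the source specifies only the 0 to 30 range and the 15 trials per level. All names are illustrative.

```python
import random

def h_statistic(values, modulus):
    sums = [0.0] * modulus
    for i, v in enumerate(values):
        sums[i % modulus] += v
    return sum(s * s for s in sums)

def channel_p(values, modulus, n_perm, rng):
    observed = h_statistic(values, modulus)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += int(h_statistic(shuffled, modulus) >= observed)
    return hits / n_perm

def power_curve(strengths, trials=15, n_perm=50, alpha=0.05,
                noise_sd=10.0, seed=0):
    """Detection rate per channel at each signal strength (mod-3 signal)."""
    rng = random.Random(seed)
    curve = {}
    for s in strengths:
        found = {3: 0, 7: 0}
        for _ in range(trials):
            # Offsets -s, 0, +s on the three mod-3 groups, plus noise.
            data = [rng.gauss(0.0, noise_sd) + s * ((i % 3) - 1)
                    for i in range(21)]
            for m in (3, 7):
                found[m] += int(channel_p(data, m, n_perm, rng) < alpha)
        curve[s] = {m: found[m] / trials for m in (3, 7)}
    return curve

curve = power_curve(range(0, 31, 5))
for s, rates in curve.items():
    print(f"strength {s:2d}: mod-3 {rates[3]:.2f}  mod-7 {rates[7]:.2f}")
```

With 15 trials per level each rate is resolved only to steps of about 0.07, so the curve is coarse; more trials per level would smooth it at the cost of run time.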

Your Data: Test Any 21 Values

Enter 21 comma-separated integers. The CRT test decomposes your data into Z/3 x Z/7 channels and runs the same permutation analysis. The default values are the amino acid hydrophobicity data above.

What This Tests

The traditional approach tests one correlation (eigenvalue vs hydrophobicity, r=0.409, p=0.073 after investigation). The CRT approach enriches this with a structured evidence tuple: per-channel p-values reveal WHICH algebraic axis carries the signal, and R-squared measures how much the decomposition explains.

The evidence tuple provides structured information beyond a single p-value. Per-channel p-values show which channels detect the signal. R-squared above the null baseline (37% for Z/21) shows genuine explanatory power beyond what grouping alone provides. The joint p-value handles multiplicity directly: it measures the fraction of permutations where ALL channels exceed simultaneously.

CAVEAT: The amino acid indexing is alphabetical by 1-letter code. This is one possible ring embedding. Other orderings (molecular weight, codon position) give different channel assignments. The method works for ANY embedding -- the question is which reveals the most structure.

The power curve shows CRT's minimum detectable effect size empirically. Channel independence means each non-signal channel is a free negative control -- no separate control group needed. This structural advantage grows with the number of CRT channels.