CRT Prediction -- antonlebed.com

92. Meadow Boundary + Regular Fraction

PASS 7/7

MEADOW BOUNDARY + REGULAR FRACTION

Z/NZ is von Neumann regular (meadow pseudo-inverse exists for all elements) iff N is squarefree. Tower B = meadow at every level. Tower A DATA, THIN = meadow. DEEP+ = NOT meadow: fattened channels D^3,K^2,E^2,b^2 contain non-zero non-units. The THIN->DEEP step crosses the meadow boundary. Meadow-involution duality: extra involutions (D^3) require fattening that breaks meadow. Regular fraction in DEEP/TRUE/TRANS = Phi_6(b)/(D^3*K*E) = 43/120. Wheel = DOWNGRADE (0/0 = point kills genesis). Hyperring = UPGRADE (0/0 = R preserves genesis).
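The squarefree criterion and the 43/120 figure can be checked numerically. The sketch below assumes the standard characterization that a residue is von Neumann regular in Z/N exactly when, in every prime-power channel, it is either 0 or a unit. The helper names (factorize, regular_fraction, is_meadow_bruteforce) are illustrative and do not come from the source.

```python
from fractions import Fraction

def factorize(n):
    """Trial-division factorization: returns {prime: exponent}."""
    f, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def regular_fraction(n):
    """Fraction of Z/n that is regular: per channel, units plus the zero residue."""
    frac = Fraction(1)
    for p, e in factorize(n).items():
        q = p ** e
        units = q - q // p          # phi(p^e)
        frac *= Fraction(units + 1, q)
    return frac

def is_meadow_bruteforce(n):
    """Direct check: every a has some x with a*x*a = a (mod n). Small n only."""
    return all(any((a * x * a - a) % n == 0 for x in range(n)) for a in range(n))

print(regular_fraction(970200))   # DEEP = 2^3 * 3^2 * 5^2 * 7^2 * 11
print(is_meadow_bruteforce(30), is_meadow_bruteforce(12))
```

Because the unfattened primes 11, 13, 17 each contribute a factor of 1, the same fraction holds at DEEP, TRUE, and TRANS.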

93. Hyperring Context

PASS 7/7

HYPERRING CONTEXT

Hyperring confidence gcd(a,N) is observable in CRT prediction accuracy. 19/95 printable ASCII chars are units of Z/N_TRANS (coprime to all 7 primes). D-channel (mod 8) has 12 zero-residue chars (most); b-channel (mod 49) has 2 (fewest). Unit-context positions achieve 311 ppt vs ZD-context 227 ppt (+37%). Effect is CHANNEL-SIZE DEPENDENT: small moduli (D:+146, K:+85, L:+94 ppt delta) see zero-residue as degenerate; large moduli (E:-54, ESC:-49) see it as specific. Naive channel masking is neutral (92 vs 93). char 'i'=105=HYDOR=K*E*b is maximally degenerate (a non-unit in 3 channels: divisible by K, E, and b).
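The character counts quoted above can be reproduced with a few lines. This is a minimal sketch that assumes "printable ASCII" means codes 32 through 126 (95 characters); the variable names are illustrative.

```python
PRIMES = (2, 3, 5, 7, 11, 13, 17)
PRINTABLE = range(32, 127)  # assumption: the 95 codes 32..126

# Units of Z/N_TRANS: codes coprime to all seven primes.
units = [c for c in PRINTABLE if all(c % p != 0 for p in PRIMES)]
# Zero-residue characters in the D-channel (mod 8) and b-channel (mod 49).
zero_d = [c for c in PRINTABLE if c % 8 == 0]
zero_b = [c for c in PRINTABLE if c % 49 == 0]
# Channel primes dividing 'i' (code 105).
i_primes = [p for p in PRIMES if ord("i") % p == 0]

print(len(units), len(zero_d), len(zero_b), i_primes)  # 19 12 2 [3, 5, 7]
```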

94. Crossmodal Bloom Immunity

Single-source addition to correlated-majority bloom.

PASS 7/7

CROSSMODAL BLOOM IMMUNITY

Adding a single source to N-source bloom with K>=N/2 correlated sources has provably zero effect. With 6 full-char joint pairs and 1 per-ch trigram (7 sources), when 5+ pairs agree on a character (87% of positions), the majority in every CRT channel is 5+ votes. Adding 1 vertical canvas source (stride 14=D*b) creates at most 2 dissenting votes (per-ch + vertical). 2 < 5: the majority is immune. Empirically confirmed: 0/199 divergent positions across all 7 channels. The vertical source IS diverse (74% D-channel disagreement with bloom) but is outvoted. Breaking bloom requires either: (a) >= 4 uncorrelated sources, (b) replacing pairs with per-channel-native predictors, or (c) restructuring bloom as hierarchical (text-domain vs spatial-domain sub-blooms). FIVE paths now closed: decoder, weighting, strict bloom, hyperring mask, single-source crossmodal.
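The counting argument can be illustrated with a plain plurality vote. The sketch below is an assumption-laden toy, not the bloom implementation: it fixes 5 agreeing votes on residue 0 and exhaustively tries every assignment of the remaining dissenting votes over a small residue alphabet.

```python
from collections import Counter
from itertools import product

def plurality(votes):
    """Return the most frequent vote."""
    return Counter(votes).most_common(1)[0][0]

def majority_is_immune(agreeing, dissenters, alphabet):
    """True if `agreeing` votes for 0 win against every choice of dissenting votes."""
    base = [0] * agreeing
    return all(plurality(base + list(extra)) == 0
               for extra in product(alphabet, repeat=dissenters))

# 5 agreeing sources vs up to 3 free votes in the D-channel (residues mod 8).
print(majority_is_immune(5, 3, range(8)))
# Without a correlated majority the extra votes can flip the outcome.
print(majority_is_immune(2, 3, range(8)))
```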

95. Per-Channel-Native Bloom Coherence

Decorrelating bloom sources by per-channel-native prediction.

PASS 6/6

PER-CHANNEL-NATIVE BLOOM COHERENCE

Per-channel-native pair predictors (6 pairs x 7 channels = 42 tables) DO decorrelate bloom sources: 118/199 positions (59%) have at least one channel where native and full-char bloom disagree. However, decorrelation DOES NOT improve prediction. NN accuracy: native 45 vs full-char 46 (within noise). Hard CRT: native 26 vs full-char 46 (-43%). Selection: native 58 = per-ch baseline. Root cause: full-char sources maintain CROSS-CHANNEL COHERENCE -- all residues derive from the same predicted character, forming a valid CRT tuple. Native sources predict each channel independently from different conditioning, producing tuples that may not correspond to any valid character. Bloom's value IS coherence, not per-channel accuracy. Decorrelating sources removes exactly the property that makes bloom useful. E-channel: ZERO divergence (small modulus dominates). D-channel: 81/199 divergence (highest). SIX paths now closed: decoder+weighting+strict+hyperring+crossmodal+per-ch-native.

96. Data Scaling Ceiling

4x data scaling curve: per-ch improves sub-linearly, NN saturates.

PASS 6/6

DATA SCALING CEILING

Per-channel text prediction accuracy scales sub-linearly with training data: 1x->2x lifts 33 ppt, 2x->4x lifts only 12 ppt (64% diminishing). CRT nearest-neighbor accuracy reaches a CEILING at 2x data: NN=105/399 at both 2x and 4x training sizes. Per-channel predictions continue to improve (268 vs 256 ppt at 4x) but the improvement does not translate to better CRT tuple matches in the 95-char codebook. The bottleneck shifts from DATA to SEARCH RESOLUTION. DL joint override is the only mechanism that continues to improve at 4x (112 vs 109), confirming that cross-channel correlation exploits data diversity that per-channel NN cannot. b-channel coverage remains sub-linear (14->20->25%) confirming the corpus diversity bottleneck. ALIVE > DEAD at all scales.

97. Weighted NN Ceiling Break

Confidence-weighted scoring breaks the binary NN ceiling at 4x data.

PASS 6/6

WEIGHTED NN CEILING BREAK

Confidence-weighted CRT nearest-neighbor scoring breaks the binary NN ceiling. Binary NN (score = #matching channels) saturates at 105/399 from 2x to 4x data. Confidence-weighted NN (score = sum of confidence where match) continues scaling: 81->106->110 (1x->2x->4x). At 4x, C1=110 vs C0=105 (+4.8%). The 2x->4x lift goes from 0 (binary) to +4 (weighted). The ceiling was a DECODER bottleneck: channels contain more information at 4x (268 vs 256 ppt) that binary scoring discards. Confidence captures prediction QUALITY, not just IDENTITY. DL joint path REGRESSES under confidence weighting at 4x (110 vs 112): joint predictions are already high-confidence, so double-weighting adds noise. Mean confidence DROPS with more data (34->31): broader coverage includes sparser contexts. Zero additional parameters -- improvement is FREE from existing trigram count metadata.
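A minimal sketch of the two scoring rules, assuming the codebook is the 95 printable ASCII codes and the TRANS channel moduli are (8, 9, 25, 49, 11, 13, 17). The function names and the confidence values are hypothetical; in the source the confidences come from trigram counts.

```python
MODULI = (8, 9, 25, 49, 11, 13, 17)
CODEBOOK = range(32, 127)

def residues(c):
    """CRT tuple of a character code."""
    return [c % q for q in MODULI]

def nn_decode(pred, conf=None):
    """Nearest codebook entry. Binary score if conf is None, else confidence-weighted."""
    weights = conf if conf is not None else [1] * len(MODULI)
    def score(c):
        return sum(w for r, p, w in zip(residues(c), pred, weights) if r == p)
    return max(CODEBOOK, key=score)

# Toy prediction: first three channels agree with 'e', last four with 'a'.
e, a = ord("e"), ord("a")
pred = residues(e)[:3] + residues(a)[3:]
conf = [9, 9, 9, 1, 1, 1, 1]   # hypothetical: the 'e' channels are far more confident

print(chr(nn_decode(pred)), chr(nn_decode(pred, conf)))
```

Binary scoring follows the four low-confidence channels, while the weighted score lets three confident channels win, which is the kind of information binary matching discards.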

98. C1 Selection Composition

Confidence-weighted decoding + bloom-agreement selection composes marginally at 4x.

PASS 6/6

C1 SELECTION COMPOSITION

Confidence-weighted CRT nearest-neighbor decoding (C1) and bloom-agreement selection compose marginally at 4x data: C1 sel=111 vs C1 pch=110 (+1/399). The composition is sub-additive -- selection alone added +6 at 1x and C1 added +5 at 4x, but combined they add only +1 over C1 alone at 4x. Root cause: bloom is data-resistant (68/399 regardless of training scale, immune to C1 weighting) so as per-channel predictions improve with data, bloom's complementary information becomes proportionally less valuable. C0 DL (112) remains the pipeline maximum. Divergence is high (380/399 = 95% of positions) but per-channel dominates when it disagrees with bloom. Oracle gap remains 20 (132 vs 112), confirming headroom exists but not through selection routing. Zero additional parameters.

99. Oracle Gap Structure

The gap between 6-method oracle and best single method is bloom-dominated and systematic.

PASS 7/7

ORACLE GAP STRUCTURE

The 6-method oracle at 4x data (139/399) exceeds the best single method (C0 DL, 112/399) by 27 positions. The gap is BLOOM-DOMINATED: 21/27 (78%) of gap positions are bloom-correct despite bloom's overall accuracy being 39% lower (68 vs 112). Gap positions have low per-channel NN scores (max 4/7, mean 2.3) and low bloom-perch agreement (81% below 3). DL is completely wrong at gap positions (mean 1.6/7 channels vs 5.7 at DL-correct). Multi-method redundancy is high (24/27 = 89% have multiple methods correct). The gap is concentrated on 8 distinct characters. The oracle gap is SYSTEMATIC and BLOOM-EXPLOITABLE: better bloom-vs-perch routing at low agreement could capture up to 21 additional positions.

100. Bloom Phantom Coherence

Bloom voting always produces perfectly self-consistent CRT tuples. Self-consistency is not accuracy.

PASS 6/6

BLOOM PHANTOM COHERENCE

Bloom voting ALWAYS produces perfectly self-consistent CRT tuples (NN score = 7/7 at all 399 positions). But bloom accuracy (68/399 = 17%) is far below DL (112/399 = 28%). Bloom's coherence is STRUCTURAL: majority voting of 7 correlated sources naturally produces valid CRT tuples ('phantom coherence'). NN-score comparison cannot distinguish correct from incorrect bloom predictions. All 5 routing strategies tested (among them agreement-gated, score-gated, global score, and strict threshold) produce WORSE accuracy than DL alone (71-83 vs 112). The 21-position oracle gap between DL and bloom IS real but UNEXPLOITABLE by score-based routing. ROUTING HYPOTHESIS FALSIFIED. Implication: the pipeline ceiling requires a new prediction mechanism, not better routing between existing methods.

101. Two-Pass Coherence

Per-channel improvement does NOT imply NN improvement. Mixed-source CRT tuples are incoherent.

PASS 7/7

TWO-PASS COHERENCE

Pass-2 cross-channel refinement improves per-channel accuracy (D:+11, K:+4, ESC:+12) but DEGRADES NN accuracy (105 to 85 full, 105 to 99 selective). Even selectively refining only improved channels creates INCOHERENT CRT tuples that don't correspond to valid characters. Oracle(P1,P2,Sel)=121 proves 16 complementary positions exist, but the information is inseparable from noise at the tuple level. MODULUS COMMENSURABILITY: small-modulus refinement helps (D from K, K from D, ESC from D) but large-modulus targets degrade (E from D: -20, b from D: -22). Per-channel accuracy is NECESSARY but NOT SUFFICIENT for NN improvement. All 7 channels must be refined CONSISTENTLY. Confirms the bloom coherence lesson from the opposite direction.

102. Full-Character Coherence

Coherence is necessary but NOT SUFFICIENT. Per-channel NN search outperforms direct character prediction.

PASS 7/7

FULL-CHARACTER COHERENCE

Full-character pass-2 (T[prev_char*8 + p1_D] -> next_char, 760 entries) is COHERENT BY CONSTRUCTION but LESS ACCURATE than per-channel NN (92/397 vs 105/399). Per-channel NN searches 95 candidates; FC commits to 1. SEARCH outweighs COHERENCE. D conditioning adds +35% over unconditioned character bigram (92 vs 68). Modulus commensurability persists: D(+21),K(+17) improve; E(-14),b(-20) degrade -- same pattern as per-ch pass-2. Oracle gap = 22 complementary positions (FC sees what NN misses). 490 split FLIPS in override: DEAD drops, ALIVE rises. IMPLICATION: per-channel CRT NN search is the correct architecture. Coherence is necessary but not sufficient.

103. Coupling-Attention Pipeline

Coupling-attention (eigenvalue inner product) is zero-parameter but noise-dominated at text scale.

PASS 6/6

COUPLING PIPELINE

Coupling-attention is a ZERO-PARAMETER predictor (eigenvalue inner product as attention kernel, no learned tables). Despite concentrating 48% of attention on same-character positions (well above ~3% random), per-channel predictions are NOISE-DOMINATED: ppt=70, essentially random (75 expected). NN accuracy 6/399 (1.5%) vs trigram 105/399 (26.3%). ROOT CAUSE: weighted-average delta computes the MEAN of transition distributions. Count-based methods select the MODE. With 48% signal and 52% noise, the mean is dominated by noise while the mode ignores it. Oracle lift +3 (0.75%) is consistent with random NN coincidence (1/95=1.05%). Coupling provides ZERO exploitable complementary signal. 11th CLOSED PATH.
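The mean-versus-mode argument can be seen in a toy vote set. The sketch assumes a plain arithmetic mean of residue deltas in the b-channel (mod 49) with 48 votes on the true delta and 52 spread as noise; the numbers are constructed for illustration and are not measurements from the source.

```python
from collections import Counter

Q = 49
SIGNAL = 5
# 48 signal votes, then 52 noise votes spread over the other residues.
votes = [SIGNAL] * 48 + [v for v in range(Q) if v != SIGNAL] + [10, 20, 30, 40]

mean_pred = round(sum(votes) / len(votes)) % Q        # what a weighted average returns
mode_pred = Counter(votes).most_common(1)[0][0]       # what a count table returns

print(len(votes), mean_pred, mode_pred)
```

The mode recovers the signal because no single noise value competes with it, while the mean is pulled toward the centre of the noise.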

104. Unreachable Ceiling Diagnostic

The 74% of test positions where all methods fail are MODEL-LIMITED, not DATA-limited.

PASS 7/7

UNREACHABLE CEILING

Of 294/399 (74%) test positions where per-channel trigram + NN fails at 4x data, 89% are MODEL-LIMITED: the correct character has nonzero bigram count in training but is not the mode. Only 10% are DATA-limited (zero-count, all LCG noise chars). The model correctly selects the MOST LIKELY successor, but actual successors rank 8th on average among 24 alternatives. Mode is 6.2x more common than actual. Score gap: correct predictions match 5.1/7 channels, wrong match 0.7/7 -- wrong predictions are maximally wrong (52% score 0). The ceiling is set by MODE SELECTION, not data availability. IMPLICATION: improving the pipeline requires moving beyond mode-based prediction to distribution-based or learned representations.

105. CRT Soft Prediction

Distribution-based scoring beats hard per-channel mode + NN search.

PASS 7/7

SOFT CRT PREDICTION

Scoring each candidate character by the normalized sum of per-channel trigram counts (CRT-factored likelihood) achieves 115/399 (28.8%), beating MODE+NN 105/399 (26.3%) by +9.5% and prior pipeline best DL=112/399. Per-channel accuracy jumps from 268 to 322 ppt (+20%). D and K channels gain most (small modulus = rich count distributions). b channel gains only +1 (sparse trigram data). 14 soft-only correct positions vs 4 NN-only: soft is a near-superset of NN. Oracle 119 (gap 4 from soft, 14 from NN). Score discriminates: mean 2322 at correct vs 2055 at wrong (+13%). ALIVE(530) > DEAD(371): strategy channels benefit more. Distribution prediction breaks the mode ceiling.

106. CRT Multiplicative Prediction

Multiplicative scoring (true ML under channel independence) matches additive.

PASS 7/7

MULTIPLICATIVE CRT PREDICTION

Multiplicative CRT-factored prediction P(c) = prod P(c mod q | ctx) with Laplace smoothing achieves 114/399 (28.6%), matching additive soft 115/399 (28.8%) within 0.9%. Channel independence CONFIRMED: the full product (true maximum likelihood under independence) and the normalized sum (log-linear approximation) produce nearly identical predictions. Per-channel accuracy identical (321 vs 322 ppt). 40/399 (10%) divergent predictions with 3 add-only and 2 mul-only corrections: methods explore slightly different candidates but converge to the same ceiling. Oracle(3)=120 (gap 5). AGGREGATION METHOD IS REDUNDANT: no conjunction effect detected. The additive score sum count/total is a sufficient statistic for the multiplicative product -- the AND property across channels adds no information beyond the SUM.
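The two aggregation rules can be written side by side. This is a sketch under stated assumptions: per-channel counts are given as one list per modulus, the additive score is the sum of count/total, and the multiplicative score is a Laplace-smoothed log-product. The toy counts and all names are hypothetical.

```python
import math

MODULI = (8, 9, 25, 49, 11, 13, 17)
CODEBOOK = range(32, 127)

def additive_score(c, counts):
    """Sum over channels of count/total for the candidate's residue."""
    return sum(cnt[c % q] / sum(cnt) for q, cnt in zip(MODULI, counts))

def multiplicative_score(c, counts, alpha=1.0):
    """Log of the product of Laplace-smoothed per-channel probabilities."""
    return sum(math.log((cnt[c % q] + alpha) / (sum(cnt) + alpha * q))
               for q, cnt in zip(MODULI, counts))

def predict(counts, score):
    return max(CODEBOOK, key=lambda c: score(c, counts))

def toy_counts(top, runner_up):
    """Each channel saw `top`'s residue 3 times and `runner_up`'s once."""
    counts = []
    for q in MODULI:
        cnt = [0] * q
        cnt[top % q] += 3
        cnt[runner_up % q] += 1
        counts.append(cnt)
    return counts

counts = toy_counts(ord("e"), ord("t"))
print(chr(predict(counts, additive_score)), chr(predict(counts, multiplicative_score)))
```

When every channel ranks the same residue first, both rules select the same character; they can only diverge where channels disagree.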

107. CRT Soft+DL Joint Cross-Channel

Cross-channel DL joint bigram added to soft per-channel prediction.

PASS 7/7

SOFT+DL JOINT CROSS-CHANNEL

Adding DL joint bigram (cross-channel D*L context, 7744 entries) to soft per-channel prediction yields marginal lift: SOFT+DL 116/399 vs SOFT 115/399 (+0.9%). DL-ONLY achieves 67/399 (16.8%) -- character-level bigram captures real structure but is SUBSUMED by 7 independent per-channel trigrams. DL(c) = (c%8)*11 + (c%11) with 88 values for 95 printable ASCII nearly identifies each character (26 distinct in corpus). 18/399 (4.5%) divergent predictions, 2 SDL-only, 1 soft-only. Oracle 117 (gap 1 from best). Cross-channel joint IS REDUNDANT with per-channel trigrams in the soft scoring framework: not just sum~product but also joint~marginals. 13th CLOSED PATH (cross-channel joint soft).
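Why DL(c) "nearly identifies" a character follows from lcm(8, 11) = 88: two codes share a DL value only if they differ by 88. A short check, assuming printable ASCII is codes 32 through 126:

```python
PRINTABLE = range(32, 127)  # assumption: the 95 codes 32..126

def dl(c):
    """Joint D*L index from the formula in the text."""
    return (c % 8) * 11 + (c % 11)

values = {dl(c) for c in PRINTABLE}
# Codes that share a DL value with another printable code (they differ by 88).
collisions = [(c, c + 88) for c in PRINTABLE if c + 88 in PRINTABLE]

print(len(values), len(collisions), len(values) ** 2)  # 88 7 7744
```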

108. CRT SGD Soft Weight

SGD-learned per-channel weights vs equal-weight soft CRT prediction.

PASS 8/8

SGD SOFT WEIGHT OPTIMALITY

Equal per-channel weights ARE optimal for soft CRT prediction. SGD coordinate descent (5 epochs, delta=200, 400-position validation) finds no generalizing improvement: SGD 115/399 = SOFT 115/399 on test (0% lift). Laplace smoothing (alpha=1) DEGRADES: 100/399 (-13%), pseudocounts dilute signal in all channels. 2/7 weights moved on validation (K:800, E:1200, +2/400 val) but 0 exclusive correct predictions on test. All 8 divergent predictions wrong for BOTH methods. Weight invariance extends from bloom voting to soft scoring framework: channel information is equally weighted at EVERY level tested -- mode voting, bloom, and distribution scoring. 14th CLOSED PATH (soft weighting).

109. Neural Trigram Compression

Per-channel embedding MLP (d=4, h=4, 1280 params) replaces count tables (142956 entries) for CRT text prediction. Same 4032-char corpus and trigram context. Neural soft 54/399 vs count soft 115/399. Oracle(count+neural) = 133/399 (+15.7%). 18 neural-only correct. 111x compression. D-ch closest (87.6%). PROVED.

PASS 6/6

NEURAL TRIGRAM COMPRESSION

Embedding MLP (1280 params) achieves 54/399 soft CRT vs count tables (142956 entries, 115/399). 111x compression ratio. Oracle(count+neural) = 133/399 (+15.7% over count alone). 18 neural-only correct predictions from embedding generalization: similar residues -> similar vectors -> shared context information. Count tables are OPTIMAL for this data size. Neural is COMPLEMENTARY, finding predictions that counts structurally cannot. D-channel (q=8) closest at 87.6% of count accuracy; b-channel (q=49) at 57.9%. Data-to-param ratio determines compression frontier.
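The table size is consistent with one trigram table of q^3 entries per channel. That layout is an inference from the totals, not something the source states; the check below only reproduces the arithmetic.

```python
MODULI = (8, 9, 25, 49, 11, 13, 17)
NEURAL_PARAMS = 1280

# Assumed layout: each channel indexes (prev2, prev1, next) residues, q^3 cells.
table_entries = sum(q ** 3 for q in MODULI)
ratio = table_entries / NEURAL_PARAMS

print(table_entries, round(ratio, 1))
```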

110. K-Channel GCD Collapse

K=3 divides axiom-smooth multipliers (42=D*K*b, 105=K*E*b), collapsing Z/9 to subring {0,3,6}. Removing K factor (14=42/K, 35=105/K) while preserving all other channel gcd profiles causes b->K to drop 1000->316 ppt (3.2x). Controls flat: D +/-4, b +/-0, b->D +/-2. PROVED.

PASS 8/8

K-CHANNEL GCD COLLAPSE

K=3 factor in axiom-smooth multipliers (42=D*K*b, 105=K*E*b) collapses Z/9 to {0,3,6}. Removing K (14=42/K, 35=105/K) preserves D(gcd 2) and b(gcd 7) but restores K to full Z/9. Experiment: b->K drops 1000->316 ppt (3.2x), K marginal 704->317 (2.2x), all->K 784->331 (2.4x). Controls: D +/-4, b +/-0, b->D +/-2 ppt. K's boundary position (ALIVE algebraically, DATA operationally) is specifically a gcd(m,K^2)>1 effect.

111. Universal GCD Collapse

GCD collapse generalizes across ALL DATA channels. Two mechanisms: (a) absorption (D,K,b: p|42) -- non-unit mul contracts inside subring; (b) entry (E: E does not divide 42) -- only non-unit mul provides subring entry. Removing each prime's factor collapses ONLY that channel: D 502->246 (2.0x), E 348->52 (6.7x), b 683->153 (4.5x). K marginal 704 stable across all experiments. Cross-channel: K->b drops 829 ppt. PROVED.

PASS 9/9

UNIVERSAL GCD COLLAPSE

For each DATA prime p in {D,K,E,b}: removing p from operation multipliers collapses ONLY p-channel prediction. Two mechanisms: (a) absorption (p|42=D*K*b): non-unit mul contracts within gcd-subring, removing p converts contraction to permutation; (b) entry (E: 42%E!=0): non-unit mul provides only route into subring, removing E blocks entry (orbit: 5->20 values). E-blindness in GCD: E does not divide 42 because 42=D*K*b=ANSWER. K marginal=704 stable across all 4 experiments (perfect control). Cross-ch degrades to marginal when source loses structure.

112. K-Channel Determinism (b->K=1000)

WHY is b->K prediction perfect? gcd(42,105)=21=K*b. Both multipliers share K and b. In Z/9: both = 6. Multiplication annihilates the K-subring {0,3,6} because 6*(3m)=18m=0 mod 9. Same for b-subring in Z/49: 42*(7m) and 105*(7m) are both multiples of 49. Both channels become deterministic add-counters (K period 3, b period 7). 7 distinct b-values uniquely determine K. Chain cause: E-D=K forces D=E mod K. PROVED.

PASS 8/8

K-CHANNEL DETERMINISM

gcd(42,105)=21=K*b. In Z/9 both multipliers=6: annihilates K-subring {0,3,6}. In Z/49: annihilates b-subring. Both channels become deterministic add-counters (period K=3, b=7). 7 distinct b-values determine K uniquely. b->K=1000 ppt. Chain cause: E-D=K forces D=E mod K, so 42=105 mod K^2. K->b=972 (asymmetric).
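The annihilation claims are pure modular arithmetic and can be verified directly. Variable names are illustrative.

```python
from math import gcd

MULTIPLIERS = (42, 105)

k_subring = [r for r in range(9) if r % 3 == 0]     # {0, 3, 6} in Z/9
b_subring = [r for r in range(49) if r % 7 == 0]    # multiples of 7 in Z/49

residues_mod_9 = [m % 9 for m in MULTIPLIERS]
kills_k = all((m * r) % 9 == 0 for m in MULTIPLIERS for r in k_subring)
kills_b = all((m * r) % 49 == 0 for m in MULTIPLIERS for r in b_subring)
congruent_mod_k2 = (105 - 42) % 9 == 0              # 42 = 105 mod K^2

print(gcd(42, 105), residues_mod_9, kills_k, kills_b, congruent_mod_k2)
```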

113. Spectral Torus

Three identities bridge coupling (algebra) and torus linking (geometry). (1) SPECTRAL ADDITION: coupling(a,b) = eigenvalue(a-b) + eigenvalue(a+b). Coupling decomposes into the D=2 bridge pair. (2) CHANNEL DEMOCRACY: every channel carries equal spectral power 2N. (3) ORTHOGONALITY: cross-channel spectral correlation = 0. Link number N/(q_i*q_j) is the Poincare dual fiber volume. PROVED.

PASS 8/8

SPECTRAL TORUS

coupling(a,b) = eigenvalue(a-b) + eigenvalue(a+b). D=2 bridge: coupling decomposed into sum and difference. Channel democracy: all channels carry spectral power 2N. Orthogonality: link(i,j)=N/(q_i*q_j) is the degeneracy factor. Link matrix rank 1.

114. D^3 Bridge Partition

The C(7,2)=21=K*b pairwise totient products phi(p_i)*phi(p_j) of the 7 axiom primes partition into D^3=8 non-D-void and GATE=13 D-void. Core identity: (K-1)(E-1)=D^3=8. CRT fingerprint: D^3 is void in D-channel and mirror (-1) in K-channel. Unique doubly-void pair: (b-1)(GATE-1)=72=D^3*K^2. PROVED.

PASS 7/7

D^3 BRIDGE PARTITION

(K-1)(E-1)=D^3=8. Pairwise totient products partition D^3:GATE = 8:13. D^3 is D-void and K-mirror. Unique doubly-void: (b-1)(GATE-1)=72=D^3*K^2.
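The 8:13 partition can be enumerated. The sketch assumes "D-void" means the product is 0 mod 8 and "doubly void" means it is also 0 mod 9; those readings are inferred from the identities in the text.

```python
from itertools import combinations

PRIMES = (2, 3, 5, 7, 11, 13, 17)

# phi(p) = p - 1 for a prime, so each pair contributes (p-1)*(q-1).
products = {(p, q): (p - 1) * (q - 1) for p, q in combinations(PRIMES, 2)}

d_void = [pair for pair, v in products.items() if v % 8 == 0]
non_d_void = [pair for pair, v in products.items() if v % 8 != 0]
doubly_void = [pair for pair, v in products.items() if v % 8 == 0 and v % 9 == 0]

print(len(products), len(non_d_void), len(d_void), doubly_void)
```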

115. CRT Representation Dominance

For bijective ring ops on Z/N with CRT decomposition Z/q_0 x...x Z/q_{k-1}, CRT per-channel prediction achieves 100% accuracy with O(max(q_i)) training data. Eigenvalue NN requires O(N) data. DEEP (Z/970200): 102 CRT params (8+9+25+49+11) vs 970200 elements. Generalization ratio 970200/49 = 19800x. CRT at T=100: 930 ppt. CRT at T=500: 1000 ppt (perfect). Eigenvalue: 0-1 ppt at all sizes. Probe 1 CLOSED: CRT is primary. PROVED.

PASS 7/7

CRT REPRESENTATION DOMINANCE

CRT per-channel achieves 100% accuracy with 102 params on Z/970200 (T=500). Eigenvalue NN: 0-1 ppt. Generalization ratio 19800x. For TRANS: 214M/49 = 4.4Mx. CRT decomposes bijective ops into per-channel bijections; eigenvalue collapses structure. Probe 1 CLOSED.
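A minimal sketch of per-channel learning on DEEP. It assumes a hypothetical affine bijection f(x) = 13x + 7 mod N as the ring op and trains on the first 500 inputs so every residue class is covered deterministically; the source's actual ops and sampling are not specified here.

```python
MODULI = (8, 9, 25, 49, 11)
N = 970200  # DEEP = 8 * 9 * 25 * 49 * 11

def f(x):
    """Hypothetical bijective ring op (13 is a unit mod N)."""
    return (13 * x + 7) % N

# One lookup table per channel: input residue -> output residue.
tables = [dict() for _ in MODULI]
for x in range(500):                       # T = 500 training samples
    y = f(x)
    for table, q in zip(tables, MODULI):
        table[x % q] = y % q

def crt(residues):
    """Recombine per-channel residues into an element of Z/N."""
    total = 0
    for r, q in zip(residues, MODULI):
        m = N // q
        total += r * m * pow(m, -1, q)
    return total % N

def predict(x):
    return crt([table[x % q] for table, q in zip(tables, MODULI)])

params = sum(len(t) for t in tables)
print(params, predict(900001) == f(900001))
```

The tables hold 102 entries in total, yet they predict inputs far outside the training range because f acts independently on each channel.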

116. Chain Uniqueness (Probe 4)

The axiom chain sigma->D->K->E->b->L->GATE->ESCAPE is the ONLY all-prime chain under the generation rules K=D+1, E=D+K, b=D+E, L=1+D+K+E, GATE=D^2+K^2, ESCAPE=D+K+E+b.

Proof: if D>2 is prime, D is odd, so K=D+1 is even and >=4 -- composite. D=2 is the only even prime. The chain is RIGID.

PASS 7/7

CHAIN UNIQUENESS

D=2 is the only valid seed. D>2 prime => D odd => K=D+1 even. Chain sigma->2->3->5->7->11->13->17 is RIGID: zero free parameters. The primes 2, 3, 5, 7, 11, 13, 17 are exactly the first 7 primes. Probe 4 CLOSED.
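The uniqueness claim is easy to confirm by search. The sketch applies the generation rules from the text to every seed below 1000 (the bound is arbitrary; the parity proof covers all seeds).

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def chain(D):
    """Generation rules from the text, starting from seed D."""
    K = D + 1
    E = D + K
    b = D + E
    L = 1 + D + K + E
    GATE = D * D + K * K
    ESCAPE = D + K + E + b
    return [D, K, E, b, L, GATE, ESCAPE]

valid_seeds = [D for D in range(2, 1000) if all(is_prime(x) for x in chain(D))]
print(chain(2), valid_seeds)
```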

117. Hurwitz-D-channel Bridge

D-channel Pareto depth 3 and Hurwitz's theorem (normed division algebras in dims 1,2,4,8) share root cause: 2-torsion of Z/2^n* stabilizes at n=3 via the Klein four-group V4 = Z/2 x Z/2.

PASS 7/7

HURWITZ D-CHANNEL BRIDGE

Z/2^n involutions: 1,2,4,4,4... Plateau at n=3 (Klein V4). Odd primes: always 2. D-channel's unique depth 3 = Hurwitz D^3 wall = Bott period 8. Same 2-adic stabilization.
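The involution counts can be reproduced by brute force. Here an "involution" is taken to be a solution of x^2 = 1 in Z/m, which matches the sequence quoted above.

```python
def involutions(m):
    """Number of x in Z/m with x*x = 1 (mod m), for m >= 2."""
    return sum(1 for x in range(1, m) if (x * x) % m == 1)

two_power_counts = [involutions(2 ** n) for n in range(1, 7)]
odd_counts = [involutions(q) for q in (3, 9, 25, 49, 11, 13, 17)]

print(two_power_counts, odd_counts)
```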

118. Meadow Projector (Probe 7)

The support projector pi maps each element to its support idempotent via per-channel regularity: pi(n)_i = 1 if p_i does not divide n, else 0. Defines ALGEBRAIC DROPOUT: each element carries its own 2^k-valued attention mask. Properties: pi^2=pi, pi(ab)=pi(a)*pi(b), pi(n)*n=n iff regular.

PASS 7/7

MEADOW PROJECTOR (PROBE 7)

Support projector pi: Z/N -> Idem(Z/N) via per-channel regularity. pi^2=pi, pi(ab)=pi(a)*pi(b), pi(n)*n=n iff regular. |pi^-1(0)|=N/rad(N)=420=heartbeat. Defines algebraic dropout: 2^7=128 TRANS attention masks from ring structure. Probe 7 built, no frame strain.
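The projector properties can be tested exhaustively on a small ring. The sketch uses N = 180 = 4 * 9 * 5 as a stand-in for TRANS (an assumption made for speed) and builds pi(n) as the sum of the primitive CRT idempotents of the channels whose prime does not divide n. Function names are illustrative.

```python
def prime_powers(n):
    """List of (p, p^e) for each prime power exactly dividing n."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            out.append((p, q))
        p += 1
    if n > 1:
        out.append((n, n))
    return out

def projector(n, N):
    """Support idempotent of n: 1 in channels where p does not divide n, else 0."""
    e = 0
    for p, q in prime_powers(N):
        if n % p != 0:
            m = N // q
            e += m * pow(m, -1, q)   # primitive idempotent of channel q
    return e % N

def is_regular(n, N):
    """Regular means: in every channel, n is 0 or a unit."""
    return all(n % q == 0 or n % p != 0 for p, q in prime_powers(N))

N = 180
pi = [projector(n, N) for n in range(N)]
print(sum(1 for e in pi if e == 0), 214414200 // 510510)
```

For N = 180 the kernel has N/rad(N) = 6 elements; the same quotient for TRANS is the 420 quoted in the text.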

119. Genesis Loop

The chain is RIGID from every starting point. Inverting generation rules from any chain element uniquely yields D=2. Inner loop {sigma,D,K,E}: sum=L, product=primorial(E)=30. Outer extension {b,L,GATE,ESCAPE}: sum=phi(DATA)=48, product=17017. Inner*outer=rad(TRANS)=510510.

PASS 7/7

GENESIS LOOP

Chain rigid from every starting point. E=5, K=3, b=7, L=11, GATE=13, ESCAPE=17 each uniquely invert to D=2. Inner loop sum=L. Outer sum=phi(DATA). Inner*outer=rad(TRANS). The chain is a locked structure with zero free parameters.

120. Idempotent Trinity

In Z/N with k prime factors, the 2^k idempotents partition into K=3 classes: EXTREME {0,sigma} (size D=2), PRIMITIVE weight-1 (size k), COMPOSITE weight 2..k-1 (size 2^k-k-2). Composite count axiom-smooth for exactly k=2..7. At k=8: intruder f(b)=41=KEY exits smooth zone.

PASS 7/7

IDEMPOTENT TRINITY

2^k idempotents = K=3 classes (extreme/primitive/composite). Composite count 2^k-k-2 is axiom-smooth for k=2..7: 0 at k=2, then K, D*E, E^2, D^3*b, b*ESCAPE (3, 10, 25, 56, 119) for k=3..7. At k=8: 246=D*K*41, so f(b)=41=KEY exits the smooth zone. OMEGA is weight-(k-1) composite, sigma=OMEGA+e_D orthogonal.
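The composite counts and the smoothness boundary can be checked in a few lines. "Axiom-smooth" is read here as "all prime factors lie among the 7 axiom primes", which is an interpretation of the text.

```python
AXIOM_PRIMES = (2, 3, 5, 7, 11, 13, 17)

def composite_count(k):
    """Idempotents of weight 2..k-1 among the 2^k idempotents."""
    return 2 ** k - k - 2

def is_axiom_smooth(n):
    """True if positive n factors completely over the axiom primes."""
    for p in AXIOM_PRIMES:
        while n % p == 0:
            n //= p
    return n == 1

def idempotents(N):
    """Brute-force idempotents of Z/N."""
    return [e for e in range(N) if (e * e) % N == e]

counts = [composite_count(k) for k in range(3, 9)]
print(counts, [is_axiom_smooth(c) for c in counts], len(idempotents(210)))
```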

121. 490 Heartbeat Decomposition

The 490 holographic split decomposes Z/N = Z/DEAD x Z/ALIVE at every Tower A level. DEAD={D,E,b} (perception). ALIVE={K,L,GATE,ESCAPE} (strategy). The heartbeat decomposes: lambda(N) = lcm(lambda_DEAD, lambda_ALIVE). ALIVE divides DEAD at DATA, DEEP, TRUE (heartbeat = DEAD alone). THIN and TRANS are tension points where ALIVE adds factors DEAD lacks.

PASS 7/7

490 HEARTBEAT DECOMPOSITION

Z/N = Z/DEAD x Z/ALIVE decomposes heartbeat: lambda(N) = lcm(lambda_DEAD, lambda_ALIVE). DEAD lambda=420 at DEEP+. ALIVE lambda={2,10,30,60,240}. ALIVE|DEAD at DATA,DEEP,TRUE. Tension: THIN lcm(12,10)=60, TRANS lcm(420,240)=1680. gcd=60=D^2*K*E.
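The heartbeat values follow from the Carmichael function. The sketch below assumes the tower factorizations DATA = 2*3*5*7, THIN = DATA*11, DEEP = 2^3*3^2*5^2*7^2*11, TRUE = DEEP*13, TRANS = TRUE*17, which are inferred from the products and moduli quoted elsewhere in this document.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def carmichael(factors):
    """Carmichael lambda from a {prime: exponent} dict."""
    parts = []
    for p, e in factors.items():
        if p == 2:
            parts.append(1 if e == 1 else 2 if e == 2 else 2 ** (e - 2))
        else:
            parts.append(p ** (e - 1) * (p - 1))
    return reduce(lcm, parts, 1)

LEVELS = {
    "DATA": {2: 1, 3: 1, 5: 1, 7: 1},
    "THIN": {2: 1, 3: 1, 5: 1, 7: 1, 11: 1},
    "DEEP": {2: 3, 3: 2, 5: 2, 7: 2, 11: 1},
    "TRUE": {2: 3, 3: 2, 5: 2, 7: 2, 11: 1, 13: 1},
    "TRANS": {2: 3, 3: 2, 5: 2, 7: 2, 11: 1, 13: 1, 17: 1},
}
DEAD_PRIMES = {2, 5, 7}

def split(level):
    """Return (lambda_DEAD, lambda_ALIVE, lambda_N) for a tower level."""
    f = LEVELS[level]
    dead = {p: e for p, e in f.items() if p in DEAD_PRIMES}
    alive = {p: e for p, e in f.items() if p not in DEAD_PRIMES}
    return carmichael(dead), carmichael(alive), carmichael(f)

for name in LEVELS:
    print(name, split(name))
```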

122. CRT Chirality (mod-49)

Chirality = left/right CRT asymmetry. An element is left-chiral if all CRT residues r satisfy 2r < modulus. The void (0) breaks mirror symmetry: each odd channel has one more left element. CR(N) = product of (m+1)/(m-1) over odd moduli. D-achiral.

PASS 7/7

CRT CHIRALITY

CR(N) = product of (m+1)/(m-1) over odd CRT moduli. D-achiral. DATA: CR=D^2=4 (AP telescoping from {K,E,b} gap D). TRUE: 2275/1152=E^2*b*GATE/(D^7*K^2). TRANS: 2275/1024=E^2*b*GATE/D^Decality. Void breaks mirror.
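The chirality ratios can be reproduced by counting left and right residues per odd channel. The moduli lists for each level are assumptions inferred from the factorizations used elsewhere in the document.

```python
from fractions import Fraction

def chirality_ratio(moduli):
    """Product over odd moduli of (#left residues) / (#right residues)."""
    cr = Fraction(1)
    for m in moduli:
        if m % 2 == 1:
            left = sum(1 for r in range(m) if 2 * r < m)   # includes the void r = 0
            right = m - left
            cr *= Fraction(left, right)
    return cr

DATA = (2, 3, 5, 7)
TRUE = (8, 9, 25, 49, 11, 13)
TRANS = (8, 9, 25, 49, 11, 13, 17)

print(chirality_ratio(DATA), chirality_ratio(TRUE), chirality_ratio(TRANS))
```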

123. Binocular Chain (mod-8)

The two genesis readings -- sigma-chain {1,2,3,5,7} and D-chain {2,3,5,7,11} -- form a binocular pair. Shared retina = {D,K,E,b}. Unique: sigma vs L.

PASS 7/7

BINOCULAR CHAIN

Sigma-chain {1,2,3,5,7}: sum=D*K^2=18, prod=DATA. D-chain {2,3,5,7,11}: sum=D^2*b=28=T(b), prod=THIN. Shared sum=ESCAPE=17. Parallax=Decality=10. Mean=c(L)=23 (first intruder). Product ratio=L. Unique sum=12=lambda(DATA). phi(TRANS)/prod(phi(primes))=lambda=420.

124. Torus Betti (GEOMETRY)

The 7-torus T^7 has Betti numbers beta_k = C(7,k). Total = 2^7 = D^7 = 128 = |Idem(Z/TRANS)|. Interior grades form K=3 Poincare-dual pairs weighted by b*{sigma,K,E}.

PASS 7/7

TORUS BETTI

T^7 Betti numbers beta_k=C(7,k). Total=D^7=128=|Idem(Z/TRANS)|. Interior=b*D*K^2=126=b*(sigma-chain sum). K=3 Poincare-dual pairs: weights b*{sigma,K,E} (odd chain elements). Even=odd=D^6=64. sigma+b=D^3 (D-channel top). beta_3=E*b=35=490 weight-3 cell. b*K^2=D^6-1.

230. Lucas CRT Independence

C(n,k) mod p = product of C(n_i, k_i) mod p, where n_i, k_i are base-p digits (Lucas). Each of 7 axiom primes uses a DIFFERENT number base: D=binary, K=ternary, E=base-5, b=base-7, L=base-11, GATE=base-13, ESCAPE=base-17. Row p (interior entries 0<k<p) is all-zero in the p-channel and not all-zero in any other channel. 7 primes = 7 independent structural views.

PASS 7/7

LUCAS CRT INDEPENDENCE

C(n,k) mod p = product of C(n_i,k_i) mod p (Lucas). Each axiom prime sees a different base-p decomposition. Row p blind in own channel (all-zero), visible in every other. Row 8: D sees 7/7 zeros (binary Sierpinski), K sees 0/7 (ternary rich). 7 primes = 7 genuinely independent views. test_lucas_crt.ax 78/78.
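A short implementation of Lucas' theorem, checked against direct binomials, reproduces the row-8 counts. The function name is illustrative.

```python
from math import comb

PRIMES = (2, 3, 5, 7, 11, 13, 17)

def lucas_binom(n, k, p):
    """C(n, k) mod p via the base-p digits of n and k."""
    result = 1
    while n or k:
        result = result * comb(n % p, k % p) % p   # comb is 0 when the digit of k exceeds n's
        n //= p
        k //= p
    return result

row8 = range(1, 8)  # interior entries of row 8
zeros_d = sum(1 for k in row8 if lucas_binom(8, k, 2) == 0)
zeros_k = sum(1 for k in row8 if lucas_binom(8, k, 3) == 0)
print(zeros_d, zeros_k)  # 7 0
```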

234. Ring Multiplication Structure

A 2D-LUT automaton with h'=LUT[h*q+x] can exactly compute ring multiplication h*x mod q for any CRT channel. The fixmult initialization (h=1, LUT[h*q+x]=(h*x)%q) achieves 100% accuracy. This is the CRT-LM Ring Multiplication Theorem (S2206). The key structural fact: each channel is an INDEPENDENT finite-state multiplier.

PASS 5/5

RING MULTIPLICATION STRUCTURE

Ring Multiplication Theorem (S2206): 2D-LUT with fixmult init h'=(h*x)%q achieves 100% exact ring multiplication across all CRT channels. Modular Discontinuity: >85% of LUT entries differ between additive and multiplicative tables. Zero absorber count = 2q-1 = GATE at Z/7. TRANS total 2D-LUT params = sum(q_i^2) = 3750. test_crt_lm_f2f.ax 7/7.
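A minimal sketch of the fixmult table and the automaton update h' = LUT[h*q + x], using the index layout given in the text. The helper names are hypothetical, and the zero-entry count is checked only for the prime modulus 7 that the text mentions.

```python
MODULI = (8, 9, 25, 49, 11, 13, 17)

def fixmult_lut(q):
    """Flat q*q table with LUT[h*q + x] = (h * x) % q."""
    return [(h * x) % q for h in range(q) for x in range(q)]

def run(lut, q, xs, h=1):
    """Feed a sequence of inputs through the automaton, starting from h = 1."""
    for x in xs:
        h = lut[h * q + x]
    return h

lut49 = fixmult_lut(49)
print(run(lut49, 49, [3, 5, 10, 48]), (3 * 5 * 10 * 48) % 49)
print(fixmult_lut(7).count(0), sum(q * q for q in MODULI))
```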

235. Basin Geometry and Viscosity

The multiplicative basin of a CRT channel has measurable width that scales inversely with modulus q. Small channels (Z/8) have wide basins (laminar); large channels (Z/49) have narrow basins (turbulent). NS viscosity (neighbor smoothing) DESTROYS multiplicative structure at any strength.

PASS 6/6

BASIN GEOMETRY AND VISCOSITY

Basin Geometry Theorem (S2207): multiplicative basin width scales inversely with modulus q. Zero absorber count = 2q-1; fraction = (2q-1)/q^2 decreases with q. Reynolds analogy: q ~ Re, small = laminar (wide basin), large = turbulent (narrow). NS Viscosity Falsification (S2212): fixmult has low neighbor matches (from zero absorber only); additive has ZERO matches. Smoothing creates a competing locally-constant attractor that overwhelms accuracy. Multiplication IS non-smooth. test_crt_lm_viscosity.ax 6/6, test_crt_lm_basin.ax 7/7.

236. Decality Lambda Chain

Tower A has 7 levels (-1 through 5). Three fattening levels beyond TRANS form the Decality: 7 + 3 = 10. Each level fattens a natural group of primes. Lambda squares when DATA primes fatten together. Lambda absorbs TRANS at the final level.

PASS 14/14

DECALITY LAMBDA CHAIN

Three fattening levels beyond TRANS: D^4 alone (lambda stays 1680), DATA primes K^3+E^3+b^3 (lambda=420^2=176400, heartbeat SQUARES, ratio=HYDOR=105), THIN primes L^2+GATE^2+ESCAPE^2 (lambda=D*TRANS=428828400=176400*2431). Lambda absorbs TRANS at Level 8. D/DATA/THIN = bridge/body/boundary. 7+3=10=Decality. test_n16_multigrid.ax 14/14.

237. GF(49) Basin Width

The Galois field GF(49) = F_7[x]/(x^2+x+3) has the same cardinality as Z/49 but zero trapped surfaces (all nonzero elements are units). This gives a wider multiplicative basin. GF(49) transfer to Z/49 is positive (4.8x random) but insufficient (18% vs 76% basin recovery).

PASS 7/7

GF(49) BASIN WIDTH

GF(49) has zero trapped surfaces (all nonzero invertible) vs Z/49 with b-1=6 trapped (non-zero non-units). GF(49) units=48=phi(DATA), Z/49 units=42=ANSWER. Basin 12pp wider (88% vs 76% at 10% perturbation). GF(49)->Z/49 transfer = 4.8x random (91/500) but insufficient for basin entry (18% << 76%). Overlap = K^3*GATE = 351 entries. test_gf49_transfer.ax 7/7.
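The unit counts can be verified by implementing GF(49) arithmetic directly. Elements are represented as pairs (a, b) meaning a + b*x with x^2 = -x - 3, following the polynomial given in the text; the representation and names are otherwise illustrative.

```python
P = 7

def gf_mul(u, v):
    """Multiply in F_7[x]/(x^2 + x + 3), using x^2 = 6x + 4 (mod 7)."""
    a, b = u
    c, d = v
    const = (a * c + 4 * b * d) % P
    linear = (a * d + b * c + 6 * b * d) % P
    return (const, linear)

elements = [(a, b) for a in range(P) for b in range(P)]
gf_units = [u for u in elements if any(gf_mul(u, v) == (1, 0) for v in elements)]

z49_units = [u for u in range(49) if any((u * v) % 49 == 1 for v in range(49))]
trapped = [u for u in range(1, 49) if u not in z49_units]

print(len(gf_units), len(z49_units), trapped)
```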

238. CD Depth Saturation

Coordinate descent on 2D-LUT tables stops improving at pass 3 for ALL tested conditions (3 of 20 passes used). The 76% recovery ceiling for Z/49 is STRUCTURAL, not search-limited. Basin width gradient: 5% perturbation -> 88% recovery, 10% -> 77%, 20% -> 45%, additive start -> 2%.

PASS 7/7

CD DEPTH SATURATION

CD Depth Saturation Theorem (S2208): CD stops at pass 3 for ALL conditions (3/20 used). Z/49 search space b^6 = 117649 is about 230x larger than Z/8 D^9 = 512. Pass 3 coverage = 3/q: 37.5% (Z/8, laminar) to 6.1% (Z/49, turbulent). Basin width gradient: 5%->88%, 10%->77%, 20%->45%, additive->2%. test_crt_lm_cd_depth.ax 7/7.

239. Multigrid Lift

The Z/7 multiplication table embeds structurally into Z/49. A Z/7 LUT with 7 entries lifts to 7 correct entries out of 49 in the Z/49 table (one per p-adic layer). Entry accuracy = 1/b = 14.3%. With CD refinement: 26%. Additive baseline: 2%. But 26% is far below the 90% basin entry threshold.

PASS 7/7

MULTIGRID LIFT

Multigrid Lift Theorem (S2209): Z/7->Z/49 embedding gives 1/b = 14.3% structural entry accuracy (1 of 2 p-adic layers). With CD: 26%. Additive: 2%. 10.3x improvement but 26% << 90% basin threshold. Multigrid alone INSUFFICIENT -- the second p-adic layer requires full Z/49 optimization. test_n16_multigrid.ax 14/14.

240. Evolutionary Training

Random entry-level mutation of 2D-LUT tables (3750 entries, 7 channels) PRESERVES structure near the multiplicative attractor but CANNOT DISCOVER improvements. Score flat at 75% across 200 generations from 90% seeding. Crossover between independently-perturbed members improves +5.1pp by combining correct entries from different parents.

PASS 7/7

EVOLUTIONARY TRAINING

Evolutionary Blindness Theorem (S2210): random entry mutation P(wrong)=(q-1)/q per entry (97.96% for Z/49). Score flat at 75% across 200 generations from 90% seeding. Basin resists drift but random search cannot navigate interior. Crossover Improvement Theorem: +5.1pp by combining correct entries from parents with different corrupted subsets. Plateaus at gen ~50 when diversity exhausts. test_crt_lm_evolve.ax 9/9.

241. 2D-PLUT Compounding

Per-position nonlinear automaton (2D-PLUT) combines position-awareness and nonlinearity. Position + nonlinearity COMPOUND: each contributes independently. The ~66% test ceiling was ARCHITECTURAL (position-blind), not task-inherent. Compounding holds at N=5000 (3.75 params/sample) with 99% overfit.

PASS 11/11

2D-PLUT COMPOUNDING

2D-PLUT Compounding Theorem (S2215-S2216): per-position nonlinear automaton h'=PLUT2D[pos*NP+off+h*q+x] with 18750 params. Position (+35pp over LUT) and nonlinearity (+56pp over LUT) each improve independently; combined 2D-PLUT adds +20pp over 2D-LUT alone. N=1000: 85.8% test. N=5000: 87.4% test (delta +21.1pp, gap GROWS). Overfit 95.7% -> 99%. The ~66% ceiling is ARCHITECTURAL (position-blind), not task-inherent. Per-channel: gain scales with saturation headroom (Z/49=0 already perfect, Z/8=+60%, Z/17=+61% to PERFECT). test_crt_lm_2dplut.ax 16/16, test_crt_lm_2dplut_5k.ax 7/7.

242. Position-Partitioning Independence

2D-PLUT's advantage is NOT position-awareness. It is CAPACITY PARTITIONING: each sequence step receives an independent transition table. Constant-position (shared table) scores 67.9%, matching 2D-LUT. Normal (5 unique tables) scores 85.8%. Three properties proved:

(A) PERMUTATION INVARIANCE: any bijective step-to-table mapping gives identical results. Normal, scrambled, and reversed permutations all produce 3002/3500 test accuracy (85.8%).

(B) UNIQUE-TABLE MONOTONICITY: test accuracy scales monotonically with the number of unique tables. 1 table: 67.9%, 2: 78.6%, 3: 82.5%, 4: 85.7%, 5: 85.8%.

(C) CONCAVE SATURATION: marginal gain per additional table decreases. In test hits out of 3500: 1->2: +376. 2->3: +137. 3->4: +109. 4->5: +4. Four tables capture 99.4% of maximum gain.
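The arithmetic behind (B) and (C) can be reproduced from the listed gains. The sketch reads the gains as test hits out of 3500, an interpretation that reproduces the percentages in (B); the helper names are illustrative.

```python
# Marginal gains per added table, as listed in (C), keyed by resulting table count.
GAINS = {2: 376, 3: 137, 4: 109, 5: 4}
FINAL_HITS, TEST_SIZE = 3002, 3500     # five-table result from (A)

def hits_by_table_count():
    """Walk back from the five-table result to recover hits for 1..5 tables."""
    hits = {5: FINAL_HITS}
    for k in (5, 4, 3, 2):
        hits[k - 1] = hits[k] - GAINS[k]
    return dict(sorted(hits.items()))

def fraction_of_max_gain(k: int) -> float:
    """Share of the total 1->5 gain already captured with k tables."""
    total = sum(GAINS.values())
    return sum(g for n, g in GAINS.items() if n <= k) / total

if __name__ == "__main__":
    for k, h in hits_by_table_count().items():
        print(k, h, f"{100 * h / TEST_SIZE:.1f}%")
    print(f"4 tables: {100 * fraction_of_max_gain(4):.1f}% of max gain")
```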

PASS 7/7

POSITION-PARTITIONING INDEPENDENCE

Position-Partitioning Independence Theorem (S2218): 2D-PLUT's advantage comes from capacity partitioning (per-step independent tables), not position-awareness. (A) Permutation invariance: any bijective step->table assignment gives identical 85.8% test. (B) Monotonicity: accuracy scales with unique tables (1:67.9%, 2:78.6%, 3:82.5%, 4:85.7%, 5:85.8%). (C) Concave saturation: 4 tables = 99.4% of maximum. CRT independence at the TABLE level. test_crt_lm_f2e.ax 10/10.

243. 490 Split Gating

Suppression-as-computation: DEAD channels (D,E,b) compute in isolation during bloom. ALIVE channels (K,L,GATE,ESCAPE) participate in cross-channel sync. 490-gated bloom preserves DEAD accuracy while enabling ALIVE coupling.

PASS 8/8

490 SPLIT GATING

490 Split Gating Theorem (S2227): 490-gated bloom preserves DEAD channel accuracy exactly (285=285) while standard bloom damages it (285->257). 490-gated > standard bloom on +1: +6.6% (N=200), +9.4% (N=500). Gap grows with data. Also on multiplicative: 490(273) > std(234). On +1, bloom helps ALIVE channels at inference. On mult, bloom is neutral (275 nb > 273 b3) -- benefit is training protection. test_crt_lm_bloom_490.ax 3/3.

PASS 5/5

N-BACK MEMORY PROTECTION

N-Back Memory Protection (S2228): n-back addition target = (seq[last] + seq[last-n]) % N. FIRST TASK WHERE 490 EXCEEDS PER-CHANNEL: n=2 490(267) > PC(254) > std(158). n=3: 490(219) ~= PC(221) >> std(102). DEAD protection: n=3 std=28 vs 490=157 (5.6x). Standard bloom DESTROYS held items; 490 gating preserves them. Baddeley working memory: DEAD=phonological loop (hold), ALIVE=central executive (integrate at bloom). test_crt_lm_nback.ax 5/5.

PASS 5/5

MULT N-BACK CROSSOVER

Multiplicative N-Back Crossover (S2229): target = (seq[last] * seq[last-n]) % N. 490 gating marginally exceeds per-channel at n=2 (279 > 263, +16, ~1.1 sigma). CROSSOVER at n=3: per-ch(208) > 490(184), +24, ~1.8 sigma. Basin wall penalizes ALIVE bloom coupling for multiplication. DEAD preservation exact (108=108 at n=3, std=34 destroyed). The OPERATION determines gating policy: additive n-back -- 490 helps at all n>=2. Multiplicative -- 490 marginal at n=2, per-ch wins at n=3. Robust findings: DEAD/ALIVE mechanism and crossover pattern. test_crt_lm_nback_mult.ax 3/3.
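The additive and multiplicative n-back targets defined above can be written as one function. This is a minimal sketch; the per-channel helper assumes each CRT channel sees the sequence reduced mod q, which is an assumption about the setup rather than something stated in the text.

```python
def nback_target(seq, n, modulus, op="add"):
    """Combine the last element with the element n steps before it, mod modulus."""
    if not 1 <= n < len(seq):
        raise ValueError("need 1 <= n < len(seq)")
    last, held = seq[-1], seq[-1 - n]
    if op == "add":
        return (last + held) % modulus
    if op == "mul":
        return (last * held) % modulus
    raise ValueError(f"unknown op: {op}")

def channel_targets(seq, n, moduli, op="add"):
    """Per-channel targets, computing in each Z/q on the reduced sequence."""
    return [nback_target([s % q for s in seq], n, q, op) for q in moduli]
```

For example, `nback_target([3, 1, 4, 1, 5], 2, 8)` combines 5 with 4, the item two steps back, which is the value the model has to hold across the intervening token.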

PASS 5/5

2D-PLUT N-BACK CAPACITY

2D-PLUT N-Back Capacity Theorem (S2230): per-position nonlinear automata (2D-PLUT, 18750 params) exceed shared-table automata (2D-LUT, 3750 params) on n-back memory tasks. 1-back: 2D-PLUT 1107 vs 2D-LUT 415 (2.67x). 2-back: 2D-PLUT 333 vs 2D-LUT 254 (+31%). Mechanism: per-position tables allow the model to specialize different sequence positions independently -- the n-back source position, intervening tokens, and prediction position each get their own transition function. This is CRT independence at the TABLE level compounding with memory: 7 channels give independence per-prime, 5 tables give independence per-step. 490 gating preserves DEAD channels exactly on 2D-PLUT (423=423, 204=204), confirming architecture-independence of the 490 mechanism. test_crt_lm_plut490.ax 3/3.

PASS 5/5

ADDITIVE N-BACK SCALING

Additive N-Back Scaling Theorem (S2232): for additive n-back tasks with 2D-LUT at N=500, 490-gated bloom performance is indistinguishable from per-channel baseline across memory depths n=1-5. n=1: PC=415, 490=408 (-7). n=2: PC=254, 490=267 (+13). n=3: PC=221, 490=219 (-2). n=4: PC=216, 490=209 (-7). n=5: PC=184, 490=187 (+3). DEAD channel preservation is perfect: 490 DEAD tracks PC DEAD within noise at all n. Standard bloom is consistently worse, destroyed at n=3 (28/600 DEAD) with partial recovery at n=4-5 from longer sequences. The n=2 advantage is a narrow window. DEPTH-INVARIANCE: 490 gating neither helps nor hurts relative to per-channel on additive tasks. Contrast: multiplicative n-back has crossover at n=3 (S2229). test_crt_lm_nback.ax 5/5.

PASS 5/5

MULT N-BACK DEPTH SCALING

Multiplicative N-Back Depth Scaling Theorem (S2233): the n=3 crossover (per-ch > 490 by 24) does NOT persist at deeper memory. n=4: 490(259) ~= PC(253) (delta +6, <0.5 sigma). n=5: 490(243) ~= PC(244), indistinguishable. The n=3 gap dissolves into noise at n=4-5. The multiplicative landscape is NON-MONOTONIC: all methods dip at n=3 and partially recover at n=4-5 (longer sequences provide more CD training signal). PC: 351/263/208/253/244. Contrast additive: monotonic decrease 415/254/221/216/184. DEAD preservation perfect at all n: 490 DEAD tracks PC DEAD within noise. 490 gating SAFE across all operations and memory depths. test_crt_lm_nback_mult.ax 5/5.

PASS 5/5

BLOOM-AWARE TRAINING WINDOW

Bloom-Aware Training Window Theorem (S2235): bloom-aware 490 CD training exceeds per-channel CD on n-back n=2 at p/s=7.5 (268 vs 254, +5.5%), but the benefit is a WINDOW, not a floor. At p/s=3.75 (N=1000), per-ch wins (533 vs 501). THREE conditions required: (1) n>=2 (cross-channel coordination), (2) p/s 5-10 (moderate capacity constraint), (3) 490 gating. DEAD channels gain +31 (+18%) from bloom-aware training at N=500: Phase 2 optimization improves DEAD entries as informants for ALIVE bloom reconstruction. Standard bloom catastrophic (108, near chance 105). test_crt_lm_bloom_nback.ax 3/3.

PASS 5/5

D-CHANNEL CRITICALITY

D-Channel Criticality Theorem (S2236): the bloom-aware training window (THM 249) is a Z/8 phenomenon. Per-channel analysis at N=500 (fair comparison, both models no-bloom eval): Z/8 (D^3, DEAD) gains +31 (+22%, ~4 sigma) -- the SOLE large beneficiary. Z/25, Z/49: exactly 0. K-channel (Z/9, ALIVE) is genuinely hurt: -15 (~2.5 sigma). L(Z/11) -5, GATE(Z/13) +2, ESCAPE(Z/17) +1 (within noise). Net +14 = Z/8(+31) + GATE(+2) + ESC(+1) + K(-15) + L(-5). Mechanism: per-channel capacity q^2/N. Z/8: 64/500=0.13 (data-rich, learns informant role). Z/25: 1.25, Z/49: 4.8 (capacity-limited). Window closes at p/s=10 (N=375, delta=-11). Re=q PARTIALLY FALSIFIED: benefit in smallest DEAD channel, not largest. test_crt_lm_bloom_perch.ax 3/3.
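The capacity ratios and the net delta quoted above follow directly from the listed numbers. A minimal check, with the per-channel deltas copied from the entry and all names chosen for illustration:

```python
MODULI = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "GATE": 13, "ESCAPE": 17}

# Per-channel change from bloom-aware training at N=500, as reported above.
DELTAS = {"D": 31, "E": 0, "b": 0, "K": -15, "L": -5, "GATE": 2, "ESCAPE": 1}

def entries_per_sample(q: int, n_samples: int) -> float:
    """Table entries per training sample, q^2/N; small values mean data-rich."""
    return q * q / n_samples

if __name__ == "__main__":
    for name, q in MODULI.items():
        print(f"{name:6s} q^2/N = {entries_per_sample(q, 500):.2f}  delta = {DELTAS[name]:+d}")
    print("net:", sum(DELTAS.values()))
```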

PASS 3/3

DEPTH INTERFERENCE

Depth Interference Theorem (S2237): stacking 2D-LUT layers in CRT-LM provides no test accuracy benefit. Full re-processing (L2 re-reads sequence from bloomed state): -34 (-13.4%). 1-step refinement (L2 corrects last token only): -2 (noise). Control without bloom: -18. Root cause: per-channel CD tables are fragile -- optimized for specific initial state distributions. L2's varied initial states (from L1+bloom) expand state space, diluting per-entry optimization. L2 train (54%) < L1 train (55.3%). Bloom between layers damages ALIVE channels (-16 vs no-bloom control). IMPLICATION: CRT-LM depth requires joint cross-channel training, not per-channel layer stacking. test_crt_lm_stack.ax 2/2.

PASS 3/3

PLUT CAPACITY-GATING CROSSOVER

PLUT Capacity-Gating Crossover Theorem (S2238): 2D-PLUT per-channel (333) exceeds 2D-PLUT 490-aware (261) and 2D-LUT 490-aware (267) on n-back n=2 at N=500. Sign reversal: LUT 490-aware was +13 over LUT per-ch (S2228); PLUT 490-aware is -72 from PLUT per-ch. DEAD channels invariant: 204/600 in all 4 conditions of the 2x2 design ({per-ch,490-aware}x{per-ch,490}). ALIVE channels drive the crossover: per-ch 129 vs 490-trained 52-62. Mechanism: per-position tables give ALIVE channels position-specific patterns that bloom -- a position-INDEPENDENT perturbation -- destroys. Two capacity regimes: LOW (LUT, p/s=7.5) where 490 helps (+5.1%) and HIGH (PLUT, p/s=37.5) where 490 hurts (-21.6%). test_crt_lm_plut490_nback.ax 3/3.
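The two capacity regimes above are defined by parameters per sample (p/s). The sketch below recomputes p/s and the relative changes from the numbers in this entry; it assumes 5 per-step tables for 2D-PLUT, as implied by 18750 = 5 * 3750.

```python
MODULI = (8, 9, 25, 49, 11, 13, 17)
LUT_PARAMS = sum(q * q for q in MODULI)   # one shared q*q table per channel
PLUT_PARAMS = 5 * LUT_PARAMS              # assumed: five per-step copies

def params_per_sample(params: int, n_samples: int) -> float:
    return params / n_samples

def relative_change(new: float, base: float) -> float:
    """Signed relative change of new with respect to base."""
    return (new - base) / base

if __name__ == "__main__":
    print("LUT  p/s:", params_per_sample(LUT_PARAMS, 500))
    print("PLUT p/s:", params_per_sample(PLUT_PARAMS, 500))
    print(f"LUT  490 vs per-ch: {100 * relative_change(267, 254):+.1f}%")
    print(f"PLUT 490 vs per-ch: {100 * relative_change(261, 333):+.1f}%")
```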

PASS 3/3

MEDIAL AXIS CONVERGENCE

Medial Axis Convergence Theorem (S2239+S2244): For CRT 7-channel spatial transforms (identity, 180deg, color_invert, vflip, hflip, 90deg, transpose) applied to each cell of a 2D grid, medial axis channel diversity is always <= edge diversity. Strict inequality arises from two independent mechanisms: (1) aperiodic color content (4/4 gradient patterns) and (2) shape asymmetry under the 7 transforms (5/5 non-transform-symmetric uniform shapes: bars, T-shape, diamond, circle). Equality holds only when BOTH shape and color are transform-symmetric (6/6 centered rectangles with uniform/periodic content). S2244: 5 non-rectangular shapes resolve the diagonal proximity confound -- the effect holds for horizontal, vertical, mixed, single-point, and rotationally symmetric medial axes. test_n15_medial.ax 30/30.

PASS 6/6

STOCHASTIC SIGNAL-NOISE SEPARATION

Stochastic Signal-Noise Separation Theorem (S2240): For a stochastic +1 task (target = (last+1)%N with probability 1-p, random with probability p), 2D-LUT CRT channels separate deterministic signal from stochastic noise. Signal accuracy degrades gracefully from 580 permil (p=0) to 408 permil (p=50), while noise accuracy stays at chance (58-72 permil). Separation ratio 5.7x-9.3x across all tested noise levels. b-channel (Z/49, 2401 entries) memorizes 91% of noise on training data but 0% on test data: high per-channel capacity enables noise overfitting. D-channel (Z/8, 64 entries) is immune to noise memorization (0-5 hits across all noise levels). Channel capacity determines noise vulnerability. First non-deterministic CRT-LM experiment. test_crt_lm_stochastic.ax 10/10.
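The stochastic +1 task described above can be generated as follows. This is a sketch, assuming the noise branch draws a target uniformly from Z/N and that inputs are uniform; `p_noise` is a fraction here (the text quotes p in percent), and the function names are hypothetical.

```python
import random

def stochastic_plus_one(last, modulus, p_noise, rng):
    """Return (target, is_noise): random target with prob p_noise, else last+1."""
    if rng.random() < p_noise:
        return rng.randrange(modulus), True
    return (last + 1) % modulus, False

def make_dataset(n_samples, modulus, p_noise, seed=0):
    """List of (last, target, is_noise) triples from a seeded generator."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        last = rng.randrange(modulus)
        target, is_noise = stochastic_plus_one(last, modulus, p_noise, rng)
        data.append((last, target, is_noise))
    return data
```

Keeping the `is_noise` flag makes it possible to score signal and noise samples separately, which is the split the separation ratio above relies on.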

PASS 6/6

STOCHASTIC SCALING

Stochastic Scaling Theorem (S2241): Signal-noise separation IMPROVES with more training data. At N=1000 vs N=500: signal permil increases at all noise levels (709 vs 580 at p=0, 719 vs 542 at p=10, 600 vs 447 at p=30, 487 vs 408 at p=50). Noise permil stays at chance (40-60 permil). Signal/noise ratio nearly doubles: 18.0x vs 9.3x (p=10), 10.0x vs 7.6x (p=30), 8.4x vs 5.7x (p=50). b-channel (Z/49) maintains perfect test noise rejection at both scales. CRT channel independence provides scaling noise immunity: more data improves signal without improving noise. The p=10>p=0 observation was FALSIFIED as single-seed artifact (THM 256). test_crt_lm_stochastic.ax 23/23.

PASS 6/6

NOISE REGULARIZATION FALSIFICATION

Noise Regularization Falsification (S2242): The S2241 observation that p=10% noise at N=1000 yielded higher signal than p=0 (719 vs 709 permil, delta=+12) was a single-seed artifact. Multi-seed verification with 5 independent data seeds, training 2D-LUT on p=0 and p=10 data and evaluating both on the same clean p=0 test set: 4/5 seeds show noise HURTS (deltas: +12, -23, -51, -13, -1). Mean delta = -15 permil. Cross-seed spread is 41 permil (p=0 range: 697-738). The +12 permil observation is well within this spread. Signal robust under noise: all p=10 seeds >= 670 permil (>> chance 58 permil). CONCLUSION: noise at 10% provides no regularization benefit for CRT 2D-LUT training. The stochastic scaling theorem (THM 255) remains valid minus the noise-as-regularizer claim. test_crt_lm_stoch_reg.ax 13/13.
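The multi-seed summary above reduces to a few lines of arithmetic on the five reported deltas. A minimal check, using only numbers quoted in this entry:

```python
from statistics import mean

# Per-seed delta (p=10 minus p=0, in permil) as reported above.
SEED_DELTAS = [12, -23, -51, -13, -1]
P0_RANGE = (697, 738)        # lowest and highest p=0 score across seeds

mean_delta = mean(SEED_DELTAS)
seeds_hurt = sum(d < 0 for d in SEED_DELTAS)
p0_spread = P0_RANGE[1] - P0_RANGE[0]

if __name__ == "__main__":
    print(f"mean delta  = {mean_delta:.1f} permil")
    print(f"seeds hurt  = {seeds_hurt}/{len(SEED_DELTAS)}")
    print(f"p=0 spread  = {p0_spread} permil")
```

The single positive delta (+12) is smaller than the 41 permil spread between p=0 seeds, which is why it is treated as noise.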

PASS 6/6

POSITION AMBIGUITY

Position Ambiguity Theorem (S2245): When a sequence prediction task requires different operations at different positions (add at even, multiply at odd), a position-free 2D-LUT automaton achieves WORSE accuracy than either pure operation alone. At N=500, seq_len=5: pure addition 100%, pure multiplication 27.3%, alternating 9.9%. The naive prediction (alternating accuracy between components) is FALSIFIED. Root cause: DESTRUCTIVE INTERFERENCE. CD optimizes each LUT entry for the operation that appears most often at that (h,x) pair. For entries shared between add and mul positions, the chosen operation DAMAGES the minority usage. Pure tasks avoid this: all positions agree. Per-channel: Z/8 accuracy collapses from 82.5% (pure mul) to 18% (alt). Initialization is irrelevant (additive vs multiplicative init converge to same result, diff < 1%). IMPLICATION: position-free CRT-LM architectures are fundamentally limited to homogeneous operation tasks. Heterogeneous tasks require position-aware models (PLUT/2D-PLUT) that maintain separate tables per position. This explains why PLUT (S2200, +83.6% over LUT) and 2D-PLUT (S2215, +20pp) outperform on tasks with position-dependent structure. test_crt_lm_alt.ax 7/7.
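The destructive interference described above can be made concrete by counting table entries where addition and multiplication demand different outputs. This is a sketch under an assumption: the state h is taken to be the running result, so a shared (h, x) entry would need to equal (h+x)%q at additive positions and (h*x)%q at multiplicative ones. The function names are illustrative.

```python
def conflicting_entries(q):
    """(h, x) pairs where the additive and multiplicative targets differ mod q."""
    return [(h, x) for h in range(q) for x in range(q)
            if (h + x) % q != (h * x) % q]

def conflict_fraction(q):
    """Share of the q*q shared-table entries that cannot serve both operations."""
    return len(conflicting_entries(q)) / (q * q)

if __name__ == "__main__":
    for q in (8, 9, 25, 49, 11, 13, 17):
        agree = q * q - len(conflicting_entries(q))
        print(f"q={q:2d}: {agree:3d} of {q * q:4d} entries agree, "
              f"{100 * conflict_fraction(q):.1f}% conflict")
```

The two operations agree exactly when (h-1)(x-1) = 1 mod q, so only phi(q) entries per channel are conflict-free; for q=8 that is 4 of 64. Under this assumption almost every shared entry has to pick one operation and damage the other.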

PASS 7/7

POSITION-AWARENESS DISSOLUTION

Position-Awareness Dissolution Theorem (S2246): Position-aware 2D-PLUT partially dissolves the position ambiguity of THM 257 on the alternating add/mul task. Correct initialization (additive at even positions, multiplicative at odd) achieves 100%/100% train/test, confirming the architecture is expressive enough. From additive default, CD achieves 76.3%/15.8% (2D-PLUT) vs 48.0%/9.9% (2D-LUT) -- a 59% improvement on both train and test. The improvement ratio is identical (1.59x), meaning position-awareness shifts accuracy without changing overfit dynamics (12.1x ratio, both). Per-channel: Z/8 gains +119% (36->79), Z/9 +57% (37->58); Z/49 unchanged (4). MECHANISM: position separation lets even-position tables retain their correct additive default while odd-position tables face the multiplicative basin wall (THM 246) independently -- no destructive interference. But basin crossing at large-q odd positions remains the bottleneck. The benefit is capacity-gated: only channels where q^2/N is small enough for CD to explore can cross. test_crt_lm_alt_plut.ax 6/6.

PASS 6/6

PARITY DETECTION

Parity Detection Theorem (S2247): On value-conditional tasks (add if input even, multiply if odd), only the D-channel (q=8, the sole even modulus) can detect input parity. Root cause: x%q preserves parity iff q is even (since kq is even, x and x%q share parity). Among the 7 CRT moduli {8,9,25,49,11,13,17}, only 8 is even. Per-channel 2D-LUT: D=96% (CD) / 100% (correct init). Odd channels: K=20%, E=5%, b=2%, L=13%, GATE=10%, ESCAPE=7.5% -- near their chance levels (1/q). D contributes 62.5% of all aggregate test hits despite being 1 of 7 channels. Aggregate conditional (21.9%) beats alternating (9.9%, THM 257) by 2.2x because D faces zero interference: each (h, x%8) entry has a unique correct target (even x%8 -> add, odd x%8 -> mul). Odd channels face 50% destructive interference on ~90% of entries (both add and mul targets share each residue class). STRUCTURAL IMPLICATION: for parity-conditional tasks, cross-channel information sharing is necessary -- odd channels require D-channel parity signal. D IS the parity bridge. test_crt_lm_cond.ax 6/6.
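The root cause stated above (x%q preserves parity iff q is even) can be verified by brute force over the seven moduli. A minimal sketch; the search limit is arbitrary but large enough to include x=q for every channel, which is the first counterexample for odd q.

```python
MODULI = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "GATE": 13, "ESCAPE": 17}

def preserves_parity(q: int, limit: int = 1000) -> bool:
    """True if x and x % q have the same parity for every x below limit."""
    return all((x % q) % 2 == x % 2 for x in range(limit))

def parity_channels():
    """Channels whose residue alone reveals the parity of the input."""
    return [name for name, q in MODULI.items() if preserves_parity(q)]

if __name__ == "__main__":
    print(parity_channels())
```

For odd q the check already fails at x=q: the residue is 0 (even) while x itself is odd.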

PASS 6/6

CONDITIONAL BLOOM SHARING

Conditional Bloom Sharing Theorem (S2248): On value-conditional tasks (add if even, multiply if odd), 490-gated bloom enables cross-channel parity sharing from D to ALIVE channels. Three conditions at N=500: (A) per-channel 2D-LUT CD baseline: D=96%, odd channels near chance (K=20%, GATE=10%), total 307/1400. (B) Standard bloom-aware: D DESTROYED 192->50 (-74%), total 185/1400 (-40%). Bloom mixes all channels, corrupting D parity with odd-channel noise. (C) 490-gated bloom-aware: D PERFECTED 200/200 (100%), K=53 (+32.5%), GATE=29 (+45%), total 339/1400 (+10.4%). MECHANISM: 490-gated bloom preserves DEAD channels {D,E,b} while mixing ALIVE channels {K,L,GATE,ESC}. D broadcasts parity information through bloom reconstruction floor(P/7)%q. ALIVE channels receive parity-encoded cross-channel signal, enabling partial conditional discrimination. DEAD preserved: 213 vs 206 (+7). ALIVE lifted: 126 vs 101 (+24.8%). Standard bloom DEAD: 70 (3x worse than 490). FIRST TASK WHERE BLOOM IS STRUCTURALLY NECESSARY: odd channels NEED D parity signal and 490-gated bloom is the ONLY architecture that delivers it without destroying D. Confirms D as the parity bridge (THM 259) and 490 split as the essential gating mechanism. test_crt_lm_cond_bloom.ax 6/6.

PASS 7/7

DIVISIBILITY SPECTRUM

Divisibility Spectrum Theorem (S2248): Per-prime divisibility conditioning (add if x%p==0, else multiply) reveals two detection regimes across 7 CRT channels. THIN CHANNELS (q=p, depth 1): L(11), GATE(13), ESCAPE(17) each achieve 100% on their own-prime divisibility. The detecting channel has a clean 1-vs-rest LUT partition (x%q==0 vs nonzero). FAT DATA CHANNELS (q=p^2, depth 2): K(9), E(25), b(49) are NOT best on their own-prime conditions. D(8) dominates all four fat-prime conditions (p=2,3,5,7). Root cause: q^2 table entries vs N training examples. K(81 entries), E(625), b(2401) are capacity-starved at N=500. D(64 entries) has the fewest entries to fill and hence the most training samples per entry. The axiom's Pareto exponents create a structural hierarchy where thin channels detect conditions cleanly while fat channels provide resolution but need more data. test_crt_lm_divisibility.ax 7/7.
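Which channel can, in principle, detect divisibility by p from its residue alone is a property of the moduli and can be checked directly. The sketch below tests whether x%q determines the truth of x%p==0; it says nothing about how well CD training finds that partition at a given N, and the function name is illustrative.

```python
def residue_detects(q: int, p: int) -> bool:
    """True if the residue x % q alone determines whether p divides x."""
    seen = {}
    for x in range(2 * p * q):           # covers every (x % q, x % p) combination
        residue, divisible = x % q, (x % p == 0)
        if seen.setdefault(residue, divisible) != divisible:
            return False                 # same residue, different answers
    return True

if __name__ == "__main__":
    moduli = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "GATE": 13, "ESCAPE": 17}
    for p in (2, 3, 5, 7, 11, 13, 17):
        sensors = [name for name, q in moduli.items() if residue_detects(q, p)]
        print(f"p={p:2d}: {sensors}")
```

Each prime has exactly one structural sensor (the channel whose modulus it divides), so D's lead on p=3,5,7 at N=500 cannot come from D's residue encoding those conditions.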

PASS 6/6

490 SPLIT BLOOM ASYMMETRY

490 Split Bloom Asymmetry Theorem (S2249): When the task-relevant sensor is DEAD (THM 260): standard bloom catastrophic, 490-gated LIFTS ALIVE. When ALIVE: standard bloom HELPS, 490-gated neutral. Root cause: DEAD channels protected from bloom at eval by 490 gating. ALIVE channels are IN the bloom channel -- eval bloom washes their own signal. ARCHITECTURAL IMPLICATION: the 490 split defines an information flow direction: DEAD->ALIVE through bloom. Perception channels (DEAD) process privately; strategy channels (ALIVE) coordinate publicly. test_crt_lm_cond_bloom_k.ax 6/6.

263. Fat Channel Capacity Crossover

THM 261 showed D dominates ALL fat-prime divisibility conditions at N=500. This theorem resolves WHY and WHEN: D's dominance is a finite-sample artifact. Each channel reaches 100% on its own prime when data saturates its 2D-LUT (N >= ~10*q^2). Three crossover points: K at N=1000, E at N=5000, b predicted at N~24000.

PASS 8/8

FAT CHANNEL CAPACITY CROSSOVER

Fat Channel Capacity Crossover Theorem (S2250): D's dominance on fat-prime divisibility conditions (THM 261) is a FINITE-SAMPLE ARTIFACT. Each CRT channel reaches 100% on its own prime's condition when data saturates its 2D-LUT: K crosses at N=1000, E at N=5000, b predicted at N~24000. Non-own channels plateau at a ceiling independent of N. The axiom's channel assignments are CORRECT -- capacity, not identity, was the bottleneck. test_crt_lm_div_scale.ax 8/8.
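The rough saturation rule N >= ~10*q^2 can be tabulated against the crossover points quoted above. A minimal sketch; the factor 10 is the approximate constant from the text, not a derived bound.

```python
FAT_CHANNELS = {"K": 9, "E": 25, "b": 49}
REPORTED_CROSSOVER = {"K": 1000, "E": 5000, "b": 24000}   # b is a prediction

def saturation_samples(q: int, factor: int = 10) -> int:
    """Rule-of-thumb sample count at which a q*q table is saturated."""
    return factor * q * q

if __name__ == "__main__":
    for name, q in FAT_CHANNELS.items():
        print(f"{name}: 10*q^2 = {saturation_samples(q):5d}, "
              f"reported crossover N = {REPORTED_CROSSOVER[name]}")
```

The rule gives 810, 6250 and 24010. E's reported crossover at N=5000 sits somewhat below 6250, so the factor 10 is best read as an order-of-magnitude guide.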

264. Bloom Training-Only

Lead (w): is bloom a TRAINING tool or INFERENCE tool for ALIVE sensors? On conditional div-by-3 task (sensor=K=ALIVE), 5 configurations compared.

PASS 7/7

BLOOM TRAINING-ONLY

Bloom is a TRAINING tool, not inference tool, for ALIVE sensors. 490-gated training + no-bloom inference = optimal for ALIVE channels. ASYMMETRIC GATING POLICY: ALIVE sensors = train with bloom, infer without. DEAD sensors = train and infer with 490 bloom (THM 260). test_crt_lm_bloom_train_only.ax 7/7.

265. E-Channel Conditional Capacity

Div-by-5 conditional: E(Z/25) preserves div-by-5. E=100% correct init but 8.5% from CD at N=500 (capacity-gated). D dominates (61.5%). At N=2000: E improves +135%.

PASS 6/6

E-CHANNEL CONDITIONAL CAPACITY

On div-by-5 conditional task, E-channel achieves 100% with correct init but only 8.5% from CD at N=500 -- capacity-gated. Extends THM 259 and THM 263 to conditional tasks. test_crt_lm_e_cond.ax 6/6.

266. Multi-Prime Cooperation

FIRST task irreducible to any single channel. Condition: x%10==0. D detects parity, E detects div-by-5. Neither alone resolves x%(D*E)==0.

PASS 7/7

MULTI-PRIME COOPERATION

Div-by-10 conditional is the first task no single CRT channel can resolve. Standard bloom (+27, +5%) outperforms 490-gated (+2) because D and E are both DEAD: 490 prevents their cross-talk. REFINED GATING POLICY: single-DEAD -> 490, multi-DEAD -> standard, ALIVE -> standard. test_crt_lm_multi_cond.ax 7/7.
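The irreducibility claim above can be checked by brute force: no single residue x%q determines x%10==0, while the pair (x%8, x%25) does. This sketch only tests what the residues encode, not what training achieves; the helper name is hypothetical.

```python
MODULI = {"D": 8, "K": 9, "E": 25, "b": 49, "L": 11, "GATE": 13, "ESCAPE": 17}

def determined_by(key, condition, limit=2000):
    """True if key(x) always maps to a single value of condition(x) for x < limit."""
    seen = {}
    for x in range(limit):
        k, flag = key(x), condition(x)
        if seen.setdefault(k, flag) != flag:
            return False
    return True

def div10(x):
    return x % 10 == 0

single = {name: determined_by(lambda x, q=q: x % q, div10) for name, q in MODULI.items()}
pair_de = determined_by(lambda x: (x % 8, x % 25), div10)

if __name__ == "__main__":
    print("single channels:", single)
    print("D and E together:", pair_de)
```

D supplies the factor 2 and E the factor 5; the pair fixes x mod 200 and therefore x mod 10, which is why the two DEAD channels need to exchange information on this task.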

267. Mixed DEAD+ALIVE Cooperation

COMPLETES gating policy (4th of 4 cases). Condition: x%6==0 (D=DEAD + K=ALIVE).

PASS 6/6

MIXED DEAD+ALIVE COOPERATION

Div-by-6 conditional requires D (DEAD) + K (ALIVE) cooperation. Standard bloom best. COMPLETES GATING POLICY: single-DEAD -> 490, multi-DEAD -> standard, ALIVE -> standard, mixed -> standard. test_crt_lm_mixed_cond.ax 7/7.
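The four-case gating policy stated above can be written as a lookup on the set of task-relevant sensor channels. This is a sketch of the stated rule only; it does not model the training-only refinement for ALIVE sensors (entry 264) or the per-channel-only case for a perfect sole sensor (entry 268), and the function name is hypothetical.

```python
DEAD = {"D", "E", "b"}
ALIVE = {"K", "L", "GATE", "ESCAPE"}

def bloom_policy(sensors):
    """Return "490" or "standard" for the channels that must detect the condition."""
    sensors = set(sensors)
    if not sensors or sensors - DEAD - ALIVE:
        raise ValueError(f"unknown or empty sensor set: {sorted(sensors)}")
    dead, alive = sensors & DEAD, sensors & ALIVE
    if len(dead) == 1 and not alive:
        return "490"          # single-DEAD: protect the lone sensor
    return "standard"         # multi-DEAD, ALIVE, or mixed: allow cross-talk

if __name__ == "__main__":
    for task, sensors in [("parity", {"D"}), ("div-by-10", {"D", "E"}),
                          ("div-by-3", {"K"}), ("div-by-6", {"D", "K"})]:
        print(f"{task:10s} -> {bloom_policy(sensors)}")
```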

268. K-Ternary Detection

K=3 compositional task: 3-way branching on x%3. K(Z/9) is sole sensor: 3|9.

PASS 6/6

K-TERNARY DETECTION

K(Z/9) achieves 100% from CD on 3-way x%3 branching -- first data channel at 100% via CD on multi-class. Bloom CATASTROPHIC (-67%). Perfect-accuracy sole sensor -> per-channel ONLY. test_crt_lm_ternary.ax 7/7.

269. Triple-Prime Conditional

First 3-channel irreducible conditional: x%30==0 (2 DEAD + 1 ALIVE).

PASS 6/6

TRIPLE-PRIME CONDITIONAL

First 3-channel irreducible conditional: div-by-30. Standard bloom best (+2.5%). Gating rules COMPOSE at higher cooperation depth. CONFIRMS: multi-channel cooperation of any depth -> standard bloom. test_crt_lm_triple_cond.ax 7/7.

270. b(Z/49) Capacity Crossover

CONFIRMS THM 263 at full scale. Div-by-7 at N=1K/10K/25K.

PASS 8/8

b CAPACITY CROSSOVER VERIFICATION

CONFIRMS THM 263: b(Z/49) reaches 96.4% on div-by-7 at N=25000, overtaking D(68.2%). D is perfectly flat at 341/500 regardless of N. The axiom's channel assignments are capacity-correct at ALL scales. test_crt_lm_b_crossover.ax 8/8.

271. Memory Barrier -- Copy Task

First non-arithmetic task. Target = first element (memory/preservation).

PASS 7/7

MEMORY BARRIER THEOREM

2D-LUT has capacity for copy but CD CANNOT reach preserve-state basin from additive default. PLUT dissolves the barrier: position-awareness makes copy trivially separable. Architecture must match task structure, not just have capacity. test_crt_lm_copy.ax 7/7.

272. Preservation Depth Scaling

Copy task at SL=5, 10, 20. PLUT remains PERFECT. 2D-LUT preserve-init degrades; Z/9 COLLAPSES at SL=20.

PASS 7/7

PRESERVATION DEPTH THEOREM

PLUT is depth-invariant. 2D-LUT degrades gracefully overall but Z/9 (K-channel) collapses from 93.5% to 10% at SL=20 (phase transition). Z/9 accounts for 98% of total decline. test_crt_lm_copy_depth.ax 9/9.

273. Seed-Depth Interaction

Multi-seed verification: Z/9 collapse is SEED-DEPENDENT (1/4 seeds), not structural.

PASS 8/8

SEED-DEPTH INTERACTION THEOREM

Z/9 collapse requires BOTH vulnerable seed AND sufficient depth. 3/4 seeds normal at SL=20. K-SPECIFIC: Z/8 immune regardless of seed or depth. SOFTENS THM 272. test_crt_lm_copy_transition.ax 13/13.

274. Position-Capacity Gating

Copy n-back: target=seq[n] for n=0..4. PLUT shows strong position gradient.

PASS 8/8

POSITION-CAPACITY GATING THEOREM

PLUT accuracy decreases sharply with target position. Small channels (Z/8, Z/11) position-immune. Large channels degrade from insufficient samples per entry. EXTENDS capacity-gating to position domain. test_crt_lm_copy_nback.ax 12/12.

275. Pattern Extraction Asymmetry

First abstraction task: extract common difference d from arithmetic sequence. ARCHITECTURE INVERSION: 2D-LUT > PLUT default.

PASS 7/7

PATTERN EXTRACTION ASYMMETRY THEOREM

Task structure determines architecture winner. Copy: PLUT >> 2D-LUT. Pattern: 2D-LUT > PLUT from default. 2D-LUT exploits (h,x) pairs for subtraction. test_crt_lm_pattern.ax 9/9.

276. 2D-PLUT Compounding

Position + state COMPOUND on pattern extraction. 2D-PLUT exceeds both components.

PASS 7/7

2D-PLUT COMPOUNDING THEOREM

Position-awareness and state-dependence compound. 2D-PLUT (81.4%) exceeds both 2D-LUT (53.2%) and PLUT (38.4%). Optimal CRT-LM architecture: 2D-PLUT subsumes both. test_crt_lm_2dplut_pattern.ax 7/7.

277. Coordination Barrier

2D-PLUT on copy task. Per-position independence does NOT dissolve basin wall.

PASS 5/5

COORDINATION BARRIER THEOREM

2D-PLUT is architecturally universal but NOT universally trainable from default. Initialization IS the inductive bias: memory tasks need preserve-state init, computation tasks work from additive default. 2D-PLUT + task-appropriate init = universal CRT-LM architecture. test_crt_lm_2dplut_copy.ax 7/7.

278. Random Init Equivalence

K^2=9 diverse random 2D-PLUT inits on copy. FALSIFIED: all at chance.

PASS 6/6

RANDOM INIT EQUIVALENCE THEOREM

K^2=9 diverse random inits all at chance on copy. Coordination barrier is STRUCTURAL, not init-specific. Basin wall requires EXPLICIT inductive bias, not random search. test_crt_lm_evolve_init.ax 4/4.

279. Evolutionary Barrier

100-gen directed evolution on copy task with 2D-PLUT. Neither random search (THM 278), CD (THM 277), nor evolution can cross the barrier.

PASS 6/6

EVOLUTIONARY BARRIER THEOREM

Directed evolution CANNOT cross the coordination barrier. Barrier structural at ALL TABLE-search levels: CD (THM 277), random (THM 278), evolution (THM 279) -- it is a property of INDEPENDENT table entries. It IS crossed by parameter-sharing + gradient (an un-collapsed recurrent cell crosses copy and pattern) -- the convergent route, not a hand-constructed one. test_crt_lm_evolve_gen.ax 9/9.

280. Developmental Programs

5 rules (CAPTURE, PRESERVE, ADD, ZERO, SUBTRACT) as developmental program vocabulary. Each rule generates a full 2D-PLUT position table from a single choice. 5-gene genome (1 rule per position) compresses 18750 entries to 5 choices (3750x).

PASS 9/9

DEVELOPMENTAL PROGRAM THEOREM

Five developmental rules (CAPTURE: h=x, PRESERVE: h=h, ADD: h=(h+x)%q, ZERO: h=0, SUBTRACT: h=(h-x+q)%q) compress 18750 2D-PLUT entries to a 5-gene genome (3750x). Copy=[0,1,1,1,1]=100% (2 optima). Pattern=[4,2,1,1,1]=100% (45 optima). Same vocabulary, both tasks. Coordinate search (26 evals): copy succeeds (pos 0 alone suffices), pattern FAILS (pos 0+1 must coordinate). Exhaustive search (3125 genomes): BOTH solved. The coordination barrier persists at ALL levels -- entries (THMs 277-279), rules (coordinate search) -- and is dissolved ONLY by exhaustive enumeration in compressed space. Genomes encode rules, not weights. test_crt_lm_dev_prog.ax 9/9.
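The five rules and the exhaustive genome search can be sketched in a few dozen lines. Everything beyond the rule definitions is an assumption made for illustration: the hidden state starts at 0, rule indices follow the listing order (0=CAPTURE ... 4=SUBTRACT), only two channels (q=8, 9) are used, and a small fixed set of probe sequences stands in for the test data.

```python
from itertools import product

CAPTURE, PRESERVE, ADD, ZERO, SUBTRACT = range(5)   # order as listed above
MODULI = (8, 9)    # assumption: two channels suffice for the sketch
SEQ_LEN = 5

def apply_rule(rule, h, x, q):
    if rule == CAPTURE:
        return x
    if rule == PRESERVE:
        return h
    if rule == ADD:
        return (h + x) % q
    if rule == ZERO:
        return 0
    if rule == SUBTRACT:
        return (h - x + q) % q
    raise ValueError(rule)

def run(genome, seq, q):
    h = 0          # assumption: hidden state starts at 0
    for rule, x in zip(genome, seq):
        h = apply_rule(rule, h, x, q)
    return h

def copy_cases():
    """(q, sequence, target) probes; the target is the first element."""
    cases = []
    for q in MODULI:
        for i in range(SEQ_LEN):
            seq = [0] * SEQ_LEN
            seq[i] = 1
            cases.append((q, seq, seq[0]))
        cases.append((q, [3, 1, 4, 1, 5], 3))
    return cases

def pattern_cases():
    """Probes from arithmetic sequences a, a+d, ...; the target is d."""
    cases = []
    for q in MODULI:
        for a in (0, 1, 3):
            for d in (0, 1, 2, 5):
                seq = [(a + i * d) % q for i in range(SEQ_LEN)]
                cases.append((q, seq, d % q))
    return cases

def solves(genome, cases):
    return all(run(genome, seq, q) == target for q, seq, target in cases)

def exhaustive_optima(cases):
    """All 5-gene genomes (5^5 = 3125 candidates) that solve every probe."""
    return [g for g in product(range(5), repeat=SEQ_LEN) if solves(g, cases)]

if __name__ == "__main__":
    print("copy optima:   ", len(exhaustive_optima(copy_cases())))
    print("pattern optima:", len(exhaustive_optima(pattern_cases())))
```

Under these assumptions the enumeration finds 2 copy genomes and 45 pattern genomes, in line with the counts quoted above. For pattern, [4,2,1,1,1] works because SUBTRACT at position 0 stores -x0 and ADD at position 1 turns it into x1-x0 = d; neither gene scores on its own, which is the coordination that coordinate search misses.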

Summary

534 / 534 checks verified

CRT Prediction & CRT-LM -- 81 theorems (92-124, 230, 234-280). Ring algebra foundations, CRT-LM architecture (10 models), basin geometry, bloom gating policy (4 cases), channel capacity scaling, 490 split dynamics, memory vs computation, pattern extraction, architecture compounding, coordination barrier, random init equivalence, evolutionary barrier, developmental programs.