arXiv:2603.28964v2 Announce Type: replace-cross
Abstract: We develop the spectral edge thesis: phase transitions in neural network training (grokking, capability gains, loss plateaus) are controlled by the spectral gap of the rolling-window Gram matrix of parameter updates. In the extreme aspect ratio regime (parameters $P \sim 10^8$, window $W \sim 10$), the classical BBP detection threshold is vacuous; the operative structure is the intra-signal gap separating dominant from subdominant modes at position $k^* = \mathrm{argmax}\, \sigma_j/\sigma_{j+1}$.
From three axioms we derive: (i) gap dynamics governed by a Dyson-type ODE with curvature asymmetry, damping, and gradient driving; (ii) a spectral loss decomposition linking each mode’s learning contribution to its Davis–Kahan stability coefficient; (iii) the Gap Maximality Principle, showing that $k^*$ is the unique dynamically privileged position: its collapse is the only one that disrupts learning, and it sustains itself through an $\alpha$-feedback loop requiring no assumption on the optimizer. The adiabatic parameter $\mathcal{A} = \|\Delta G\|_F / (\eta\, g^2)$ controls circuit stability: $\mathcal{A} \ll 1$ (plateau), $\mathcal{A} \sim 1$ (phase transition), $\mathcal{A} \gg 1$ (forgetting).
Tested across six model families (150K–124M parameters): gap dynamics precede every grokking event (24/24 with weight decay, 1/24 without), the gap position is optimizer-dependent (Muon: $k^*=1$, AdamW: $k^*=2$ on the same model), and 19/20 quantitative predictions are confirmed. The framework is consistent with the edge of stability, Tensor Programs, Dyson Brownian motion, the Lottery Ticket Hypothesis, and neural scaling laws.
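As an illustration of the gap position $k^*$ defined in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the rolling window is stored as a $W \times P$ array of flattened parameter updates and takes $\sigma_j$ to be the singular values of that array, obtained from the $W \times W$ Gram matrix. The function name `gap_position`, the array layout, and the small numerical floor are hypothetical choices made for this example.

```python
import numpy as np


def gap_position(updates):
    """Return (k_star, sigma, ratios) for a window of parameter updates.

    updates: array of shape (W, P), one flattened update per row
             (this layout is an assumption of the sketch).
    k_star is 1-indexed: the j that maximizes sigma_j / sigma_{j+1}.
    """
    updates = np.asarray(updates, dtype=float)
    if updates.ndim != 2 or updates.shape[0] < 2:
        raise ValueError("need a 2-D window with at least two updates")

    # W x W Gram matrix of the window; cheap to form when W << P.
    gram = updates @ updates.T

    # Eigenvalues of the Gram matrix are the squared singular values
    # of the window. eigvalsh returns them in ascending order.
    eigvals = np.linalg.eigvalsh(gram)[::-1]
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    if sigma[0] == 0.0:
        raise ValueError("all updates are zero; the gap is undefined")

    # Hypothetical floor to avoid dividing by an exactly zero mode.
    floor = 1e-12 * sigma[0]
    ratios = sigma[:-1] / np.maximum(sigma[1:], floor)

    k_star = int(np.argmax(ratios)) + 1
    return k_star, sigma, ratios
```

Working with the Gram matrix rather than an SVD of the full window keeps the cost at $O(W^2 P)$ for the product plus a $W \times W$ eigendecomposition, which fits the extreme aspect ratio regime the abstract describes. Note that if the window is rank-deficient, the largest ratio will sit at the rank boundary, so a practical version would likely need a noise-aware cutoff, which this sketch does not attempt.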
Assessing nurses’ attitudes toward artificial intelligence in Kazakhstan: psychometric validation of a nine-item scale
Background: Artificial intelligence (AI) is increasingly integrated into healthcare, yet the attitudes and knowledge of nurses, who are the key mediators of AI implementation, remain underexplored.


