arXiv:2604.04384v1 Announce Type: cross
Abstract: Across every attention head in five transformer language models (124M–7B parameters, four architecture families), the logit energy field $\tilde{E}$ reaches 90% of its variance in 2–11 singular components. The \emph{learned} interaction matrix $W_Q^{\mathrm{T}} W_K$ needs 38–75 components to reach the same threshold out of $d_h \in \{64, 128\}$ total dimensions. The spectral gap is $5$–$25\times$ in effective rank. The attention mechanism allocates capacity uniformly across all $d_h$ dimensions, but language concentrates the actual interaction into a few. The compressibility of softmax-attended language is a property of the data, not the frame that analyzes it.
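The abstract's central measurement, the number of singular components needed to capture 90% of a matrix's variance, is straightforward to compute. Below is a minimal sketch of that calculation; the function name `components_for_variance` and the example matrices are illustrative, not from the paper, which analyzes trained $W_Q^{\mathrm{T}} W_K$ matrices rather than synthetic ones.

```python
import numpy as np

def components_for_variance(M, threshold=0.90):
    """Number of singular components needed to capture `threshold`
    of the total variance (sum of squared singular values) of M."""
    s = np.linalg.svd(M, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    # First index where cumulative energy crosses the threshold (1-based count).
    return int(np.searchsorted(energy, threshold) + 1)

rng = np.random.default_rng(0)
# A rank-3 matrix: the interaction concentrates in <= 3 components.
low_rank = rng.normal(size=(64, 3)) @ rng.normal(size=(3, 64))
# A dense Gaussian matrix: energy spreads across many components.
full = rng.normal(size=(64, 64))
print(components_for_variance(low_rank))  # at most 3
print(components_for_variance(full))      # a large fraction of 64
```

The gap between the two printed counts mirrors the paper's claim: a spectrally concentrated interaction reaches the 90% threshold in far fewer components than a matrix whose capacity is spread uniformly.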
WearBCI Dataset: Understanding and Benchmarking Real-World Wearable Brain-Computer Interfaces Signals
arXiv:2604.09649v1 Announce Type: cross
Abstract: Brain-computer interfaces (BCIs) have opened new platforms for human-computer interaction, medical diagnostics, and neurorehabilitation. Wearable BCI systems, which typically employ


