arXiv:2604.01178v3 Announce Type: replace-cross
Abstract: A core limitation of standard softmax attention is that it does not provide an independently interpretable measure of query–key relevance: attention scores are unbounded, while attention weights are defined only relative to competing keys. Consequently, irrelevant keys cannot be explicitly rejected, and some attention mass is assigned even when no key is genuinely relevant. We introduce Multiscreen, a language-model architecture built around a mechanism we call screening, which enables absolute query–key relevance. Instead of redistributing attention across all keys, screening computes bounded query–key similarities and applies an explicit threshold, discarding irrelevant keys and aggregating the remaining keys without global competition. Across experiments, Multiscreen achieves comparable validation loss with roughly 30% fewer parameters than a Transformer baseline and remains stable at substantially larger learning rates. It maintains stable long-context perplexity beyond the training context and shows little degradation in retrieval performance as context length increases. Finally, Multiscreen achieves lower full-context forward-pass latency at long context lengths.
Crisis support teams’ technological openness and learning attitudes toward the AI based virtual patient system crisis support VR
BackgroundAgainst the backdrop of escalating global humanitarian crises, innovative didactic simulations are becoming increasingly important. A promising alternative to traditional classroom-based didactics for learning psychological