arXiv:2602.21204v3 Announce Type: replace-cross
Abstract: Test-time training (TTT) with KV binding as sequence modeling layer is commonly interpreted as a form of online meta-learning that memorizes a key-value mapping at test time. However, our analysis reveals multiple phenomena that contradict this memorization-based interpretation. Motivated by these findings, we revisit the formulation of TTT and show that a broad class of TTT architectures can be expressed as a form of learned linear attention operator. Beyond explaining previously puzzling model behaviors, this perspective yields multiple practical benefits: it enables principled architectural simplifications, admits fully parallel formulations that preserve performance while improving efficiency, and provides a systematic reduction of diverse TTT variants to a standard linear attention form. Overall, our results reframe TTT not as test-time memorization, but as learned linear attention with enhanced representational capacity. Project page: https://research.nvidia.com/labs/sil/projects/tttla/.
Crisis support teams’ technological openness and learning attitudes toward the AI based virtual patient system crisis support VR
BackgroundAgainst the backdrop of escalating global humanitarian crises, innovative didactic simulations are becoming increasingly important. A promising alternative to traditional classroom-based didactics for learning psychological