BiSpikCLM: A Spiking Language Model integrating Softmax-Free Spiking Attention and Spike-Aware Alignment Distillation

ROAD: Adaptive Data Mixing for Offline-to-Online Reinforcement Learning via Bi-Level Optimization

arXiv:2605.14497v1 Announce Type: cross Abstract: Offline-to-online reinforcement learning harnesses the stability of offline pretraining and the flexibility of online fine-tuning. A key challenge lies in

Web Agents Should Adopt the Plan-Then-Execute Paradigm

arXiv:2605.14290v1 Announce Type: cross Abstract: ReAct has become the default architecture across LLM agents, and many existing web agents follow this paradigm. We argue that

Data-Augmented Game Starts for Accelerating Self-Play Exploration in Imperfect Information Games

arXiv:2605.14379v1 Announce Type: cross Abstract: Finding approximate equilibria for large-scale imperfect-information competitive games such as StarCraft, Dota, and CounterStrike remains computationally infeasible due to sparse

HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts

arXiv:2605.13997v1 Announce Type: cross Abstract: Sparse Mixture-of-Experts (MoE) layers route tokens through a handful of experts, and learning-free compression of these layers reduces inference cost

Why Retrieval-Augmented Generation Fails: A Graph Perspective

arXiv:2605.14192v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has become a powerful and widely used approach for improving large language models by grounding generation in