arXiv:2605.20244v1 Announce Type: cross Abstract: We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are […]
Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints
arXiv:2605.21085v1 Announce Type: cross Abstract: Communication enables coordination in multi-agent reinforcement learning (MARL), but many real-world applications, e.g., search-and-rescue with drone swarms, operate under severe bandwidth constraints. Many communication architectures still expose a coupled bottleneck in which a shared latent representation is used for both policy execution and inter-agent communication. Consequently, reducing message size directly […]
ProcBench: Evaluating Process-Level Defects and Control Preservation in LLM Coding Agents
arXiv:2605.20251v1 Announce Type: cross Abstract: Existing benchmarks for LLM coding agents mainly evaluate final outcomes, such as task completion, compilation success, and test pass rates. While these metrics are useful for measuring end-task capability, they provide limited visibility into how an execution unfolds and often miss recurrent process-level failures that arise during multi-step operation. We […]
The Impact of AI Search on the Online Content Ecosystem: Evidence from Google and Reddit
arXiv:2605.16428v2 Announce Type: replace-cross Abstract: Search engines traditionally complement online content platforms by directing users seeking information to external websites. The emergence of generative AI search tools that summarize answers directly on the results page may disrupt this relationship by making visits to source platforms optional. We study this question using Google AI Overviews and […]
Instance Discrimination for Link Prediction
arXiv:2605.20257v1 Announce Type: cross Abstract: Recently, instance discrimination models have emerged as a major solution for self-supervised learning. Having already demonstrated its effectiveness in the image domain, instance discrimination learning is now proving equally convincing in the graph domain, in particular for node classification. However, fewer contributions have tackled the link prediction task. In this […]
Fine-grained Claim-level RAG Benchmark for Law
arXiv:2605.21071v1 Announce Type: cross Abstract: The rapid progress of large language models (LLMs) is shifting semantic search toward a question-answering paradigm, where users ask questions and LLMs generate responses. In high-stakes domains such as law, retrieval-augmented generation (RAG) is commonly used to mitigate hallucinations in generated responses. Nonetheless, prior work shows that RAG systems, whether […]
Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding
arXiv:2605.20268v1 Announce Type: cross Abstract: Real-world time series come with text: metadata, descriptions, news, reports. Yet time series foundation models process numerical sequences in isolation, and the multimodal text-and-time-series models that attempt to bridge the two all adapt a pretrained language model post hoc, inheriting representations shaped without ever seeing temporal data. These models are […]
Bridging Silicon and the Hippocampus: Algebro-Deterministic Memory “VaCoAl” as a Substrate for Vector-HaSH and TEM
arXiv:2605.15652v3 Announce Type: replace-cross Abstract: Vector-HaSH and the Tolman-Eichenbaum Machine propose the hippocampal-entorhinal circuit factorizes content from a grid-cell scaffold, supporting compositional memory via ripple-mediated replay. Human electrophysiology shows multi-hop replay fidelity decays multiplicatively. We show VaCoAl, an algebro-deterministic hyperdimensional memory built from Galois-field linear-feedback shift registers, supplies a shared algebraic object. Specifically: (i) deterministic Galois-field […]
Modality-Decoupled Online Recursive Editing
arXiv:2605.20273v1 Announce Type: cross Abstract: Online model editing for multimodal large language models (MLLMs) requires assimilating a stream of corrections under tight compute and memory budgets. Yet editors developed for text-only LLMs often degrade on MLLMs: visually dominant activations skew the statistics that shape updates, causing cross-modal conflict, while sequential writes become entangled in a […]
Grounding Driving VLA via Inverse Kinematics
arXiv:2605.21061v1 Announce Type: cross Abstract: Existing Driving VLAs predict trajectories while largely ignoring their visual tokens — a phenomenon we trace not to insufficient training but to a structurally ill-posed task formulation. We show that trajectory recovery, when viewed through the lens of inverse kinematics, requires both a current and a future visual state as […]
ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison
arXiv:2605.20278v1 Announce Type: cross Abstract: Long-form image captioning exposes a reward granularity problem in RL: captions are judged as whole sequences, while the important errors occur at the level of individual visual claims. A good dense caption should be both faithful and informative, avoiding hallucination without omitting salient details. Yet pairwise preferences, reference-based metrics, and […]
MeMo: Memory as a Model
arXiv:2605.15156v2 Announce Type: replace-cross Abstract: Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until subsequent updates. Many real-world applications require timely, domain-specific information, motivating the need for efficient mechanisms to incorporate new knowledge. In this paper, we introduce MeMo (Memory as a Model), a modular […]