arXiv:2605.05811v1 Announce Type: new Abstract: Workbook-scale spreadsheet understanding is increasingly important for language-model-based data analysis agents, but remains challenging because relevant information is often distributed across multiple sheets with heterogeneous schemas, layouts, and implicit relationships. Existing retrieval-augmented approaches typically decompose spreadsheets into rows, columns, or blocks to improve scalability; however, such chunk-centric representations can fragment […]
ANCORA: Learning to Question via Manifold-Anchored Self-Play for Verifiable Reasoning
arXiv:2604.27644v2 Announce Type: replace-cross Abstract: We propose a paradigm shift toward open-ended curriculum self-play: rather than learning to answer on a fixed prompt set, a unified policy learns to question: generating verifiable problems, solving them, and turning verifier feedback into self-improvement without human-annotated solutions. We introduce ANCORA, in which the policy alternates between a Proposer […]
MolRecBench-Wild: A Real-World Benchmark for Optical Chemical Structure Recognition
arXiv:2605.05832v1 Announce Type: new Abstract: Optical Chemical Structure Recognition (OCSR) aims to translate molecular diagrams in scientific literature into machine-readable formats, but current systems remain unreliable on real-world images due to substantial visual and chemical complexity. We introduce MOSAIC, a dual-dimensional difficulty framework with 37 fine-grained labels that jointly characterize visual interference and chemical semantic […]
Milestone-Guided Policy Learning for Long-Horizon Language Agents
arXiv:2605.06078v1 Announce Type: cross Abstract: While long-horizon agentic tasks require language agents to perform dozens of sequential decisions, training such agents with reinforcement learning remains challenging. We identify two root causes: credit misattribution, where correct early actions are penalized due to terminal failures, and sample inefficiency, where scarce successful trajectories result in near-total loss of […]
SANEmerg: An Emergent Communication Framework for Semantic-aware Agentic AI Networking
arXiv:2605.05861v1 Announce Type: new Abstract: Future networking systems are envisioned to become part of an agentic AI-native ecosystem in which a vast number of heterogeneous and specialized AI agents cooperate seamlessly to fulfill complex user requirements in real time. However, traditional networking paradigms are characterized by a rigid decoupling of communication and computation, which often […]
How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum
arXiv:2604.25907v2 Announce Type: replace-cross Abstract: SFT-then-RLVR is widely used for post-training reasoning models, but why this specific ordering, and why RLVR-only stalls at cold start, have lacked a unifying theoretical account. We provide that account under a unified loss family $J_Q$ using the Tsallis $q$-logarithm. $J_Q$ is a single-parameter family that interpolates between RLVR (at […]
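For reference, the Tsallis $q$-logarithm named in the abstract is conventionally defined as below. This is the standard textbook definition, not a quotation from the paper; how the paper builds $J_Q$ from it, and which parameter values correspond to which training regime, is cut off in the snippet above and is not reconstructed here.

$$
\ln_q(x) = \frac{x^{1-q} - 1}{1 - q} \quad (x > 0,\ q \neq 1), \qquad \lim_{q \to 1} \ln_q(x) = \ln x .
$$

Two special cases follow directly: $\ln_0(x) = x - 1$, which is linear in $x$, and $\ln_1(x) = \ln x$, the ordinary logarithm.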
Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning
arXiv:2605.05909v1 Announce Type: new Abstract: The core challenge of machine unlearning is to strike a balance between target knowledge removal and non-target knowledge retention. In the context of Multimodal Large Language Models (MLLMs), this challenge becomes even more pronounced, as knowledge is further divided into visual and textual modalities that are tightly intertwined. In this […]
Normalized Architectures are Natively 4-Bit
arXiv:2605.06067v1 Announce Type: cross Abstract: Training large language models at 4-bit precision is critical for efficiency. We show that nGPT, an architecture that constrains weights and hidden representations to the unit hypersphere, is inherently more robust to low-precision arithmetic. This removes the need for interventions (such as applying random Hadamard transforms and performing per-tensor scaling calculations) to […]
Which Are the Low-Resource Languages of the Semantic Web?
arXiv:2605.05929v1 Announce Type: new Abstract: Emerging digital technologies are exacerbating the existing divide in Open Access Data (OAD) between high- and low-resource languages, excluding many communities from the global digital transformation. Multilingual Linked Open Data Knowledge Graphs (LOD KGs) could contribute to mitigating this divide through cross-lingual transfer; however, no clear quantitative definition of low-resource languages […]
Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning
arXiv:2604.22031v2 Announce Type: replace-cross Abstract: We propose Mochi, a Graph Foundation Model that addresses task unification and training efficiency by adopting a meta-learning based training framework. Prior models pre-train with reconstruction-based objectives such as link prediction, and assume that the resulting representations can be aligned with downstream tasks through a separate unification step such as […]
HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning
arXiv:2605.05951v1 Announce Type: new Abstract: World models enable model-based planning through learned latent dynamics, but imagined rollouts become unstable as the planning horizon grows or the dynamics distribution shifts. We argue that this instability reflects two missing structures in planner-facing latents: history-conditioned memory for approximate Markov completeness, and geometric organization that separates configuration, momentum, and […]
Causal Reinforcement Learning for Complex Card Games: A Magic The Gathering Benchmark
arXiv:2605.06066v1 Announce Type: cross Abstract: Causal reinforcement learning (RL) lacks benchmarks for complex systems that combine sequential decision making, hidden information, large masked action spaces, and explicit causal structure. We introduce MTG-Causal-RL, a Gymnasium benchmark built on Magic: The Gathering with a 3,077-dimensional partial observation, a 478-action masked discrete action space, five competitive Standard archetypes, […]
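As a rough illustration of what a "masked discrete action space" means in practice, the sketch below samples an action from policy logits while giving illegal actions zero probability. It is not the benchmark's API: the function name `sample_masked_action` and the constant `N_ACTIONS` are hypothetical, and only the action count of 478 is taken from the abstract.

```python
import numpy as np

# Size of the discrete action space quoted in the abstract.
N_ACTIONS = 478


def sample_masked_action(logits, mask, rng):
    """Sample an action index; masked-out (illegal) actions get probability 0.

    Hypothetical helper for illustration only, not part of MTG-Causal-RL.
    """
    logits = np.asarray(logits, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be legal")
    # Illegal actions receive -inf so that exp() maps them to exactly 0.
    masked = np.where(mask, logits, -np.inf)
    # Subtract the largest legal logit for a numerically stable softmax.
    masked = masked - masked.max()
    probs = np.exp(masked)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    legal = np.zeros(N_ACTIONS, dtype=bool)
    legal[[3, 100, 477]] = True  # pretend only three actions are legal
    action = sample_masked_action(rng.normal(size=N_ACTIONS), legal, rng)
    print(action in (3, 100, 477))
```

Masking the logits before the softmax, rather than resampling until a legal action appears, keeps the sampling cost fixed even when only a few of the 478 actions are legal.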