arXiv:2605.10574v2 Announce Type: replace Abstract: As artificial intelligence advances, models are not improving uniformly. Instead, progress unfolds in a jagged fashion, with capabilities growing unevenly across tasks, domains, and model scales. In this work, we examine this dynamic jaggedness through the lens of scientific idea generation. We introduce SciAidanBench, a benchmark of open-ended scientific questions […]
Decoding species coexistence: A reinforcement learning perspective
arXiv:2508.17599v2 Announce Type: replace Abstract: A central goal in ecology is to understand how biodiversity is maintained. Previous theoretical works have employed the rock-paper-scissors (RPS) game as a toy model, demonstrating that population mobility is crucial in determining the species’ coexistence. One key prediction is that biodiversity is jeopardized and eventually lost when mobility exceeds […]
SDM: A Powerful Tool for Evaluating Model Robustness
arXiv:2605.20308v1 Announce Type: cross Abstract: Gradient-based attacks are important methods for evaluating model robustness. However, since the proposal of APGD, it has been difficult for such methods to achieve significant breakthroughs. To achieve such an effect, we first analyze the issue of “high-loss non-adversarial examples” that degrades attack performance in previous methods, and prove that […]
IMPACT: Influence Modeling for Open-Set Time Series Anomaly Detection
arXiv:2603.29183v3 Announce Type: replace-cross Abstract: Open-set anomaly detection (OSAD) is an emerging paradigm designed to utilize limited labeled data from anomaly classes seen in training to identify both seen and unseen anomalies during testing. Current approaches rely on simple augmentation methods to generate pseudo anomalies that replicate unseen anomalies. Despite being promising in image data, […]
Towards Autonomous Mechanistic Reasoning in Virtual Cells
arXiv:2604.11661v3 Announce Type: replace-cross Abstract: Large language models (LLMs) have recently gained significant attention as a promising approach to accelerate scientific discovery. However, their application in open-ended scientific domains such as biology remains limited, primarily due to the lack of factually grounded and actionable explanations. To address this, we introduce a structured explanation formalism for […]
Proportional Selection in Networks
arXiv:2502.03545v2 Announce Type: replace-cross Abstract: We address the problem of selecting $k$ representative nodes from a network, aiming to achieve two objectives: identifying the most influential nodes and ensuring the selection proportionally reflects the network’s diversity. We propose two approaches to accomplish this, analyze them theoretically, and demonstrate their effectiveness through a series of experiments.
When Does Adaptation Win? Scaling Laws for Meta-Learning in Quantum Control
arXiv:2601.18973v4 Announce Type: replace-cross Abstract: Quantum hardware suffers from intrinsic device heterogeneity and environmental drift, forcing practitioners to choose between suboptimal non-adaptive controllers or costly per-device recalibration. We derive a scaling law lower bound for meta-learning showing that the adaptation gain (expected fidelity improvement from task-specific gradient steps) saturates exponentially with gradient steps and scales […]
Epistemic Regret Minimization: Label-Free Causal Critique Beyond Outcome Reward
arXiv:2602.11675v4 Announce Type: replace Abstract: Large language models can answer causal questions correctly for the wrong reasons. Current RL methods reward what a model concludes but ignore why, reinforcing correlational shortcuts, a failure we call Reward Entrenchment. We introduce Epistemic Regret Minimization (ERM), a framework that critiques the causal structure of a model’s reasoning […]
ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandbox
arXiv:2605.10787v2 Announce Type: replace Abstract: Current LLM agents are proficient at calling isolated APIs but struggle with the “last mile” of commercial software automation. In real-world scenarios, tools are not independent; they are atomic, interdependent, and prone to environmental noise. We introduce ComplexMCP, a benchmark designed to evaluate agents in these rigorous conditions. Built on […]
Access Paths for Efficient Ordering with Large Language Models
arXiv:2509.00303v3 Announce Type: replace-cross Abstract: In this work, we present the LLM ORDER BY semantic operator as a logical abstraction and conduct a systematic study of its physical implementations. First, we propose several improvements to existing semantic sorting algorithms and introduce a semantic-aware external merge sort algorithm. Our extensive evaluation reveals that no single implementation […]
SPARC: Spatial-Aware Path Planning via Attentive Robot Communication
arXiv:2603.02845v3 Announce Type: replace-cross Abstract: Efficient communication is critical for decentralized Multi-Robot Path Planning (MRPP), yet existing learned communication methods treat all neighboring robots equally regardless of their spatial proximity, leading to diluted attention in congested regions where coordination matters most. We propose Relation enhanced Multi Head Attention (RMHA), a communication mechanism that explicitly embeds […]
Agentic Physical AI toward a Domain-Specific Foundation Model for Nuclear Reactor Control
arXiv:2512.23292v3 Announce Type: replace Abstract: The prevailing paradigm in AI for physical systems (scaling general-purpose foundation models toward universal multimodal reasoning) confronts a fundamental barrier at the control interface. Recent benchmarks show that even frontier vision–language models achieve only 50–53% accuracy on basic quantitative physics tasks, behaving as approximate guessers that preserve semantic plausibility by […]