arXiv:2604.15768v2 Announce Type: replace-cross Abstract: AI-driven methods have demonstrated considerable success in tackling the central challenge of accurately solving the Schrödinger equation for complex many-body systems. Among neural network quantum state (NNQS) approaches, the NNQS-SCI (Selected Configuration Interaction) method stands out as a state-of-the-art technique, recognized for its high accuracy and scalability. However, its application […]
B-PASTE: Beam-Aware Pattern-Guided Speculative Execution for Resource-Constrained LLM Agents
arXiv:2604.16469v1 Announce Type: cross Abstract: LLM agents execute in an interleaved reasoning-and-action loop, where future tool calls cannot be launched until the current reasoning step completes. This serial dependency inflates end-to-end latency and leaves the model idle while waiting for tool execution. Prior work, Pattern-Aware Speculative Tool Execution (PASTE), mitigates this bottleneck by speculating likely […]
DexWorldModel: Causal Latent World Modeling towards Automated Learning of Embodied Tasks
arXiv:2604.16484v1 Announce Type: cross Abstract: Deploying generative World-Action Models for manipulation is severely bottlenecked by redundant pixel-level reconstruction, $\mathcal{O}(T)$ memory scaling, and sequential inference latency. We introduce the Causal Latent World Model (CLWM), which employs DINOv3 features as generative targets to disentangle interaction semantics from visual noise, yielding highly robust domain generalization. To overcome memory […]
Filling the Gaps: Selective Knowledge Augmentation for LLM Recommenders
arXiv:2604.07825v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have recently emerged as powerful training-free recommenders. However, their knowledge of individual items is inevitably uneven due to imbalanced information exposure during pretraining, a phenomenon we refer to as the knowledge gap problem. To address this, most prior methods have employed a naive uniform augmentation that appends […]
TIP: Token Importance in On-Policy Distillation
arXiv:2604.14084v2 Announce Type: replace-cross Abstract: On-policy knowledge distillation (OPD) trains a student on its own rollouts under token-level supervision from a teacher. Not all token positions matter equally, but existing views of token importance are incomplete. We ask a direct question: which tokens carry the most useful learning signal in OPD? Our answer is that […]
What Makes AI Research Replicable? Executable Knowledge Graphs as Scientific Knowledge Representations
arXiv:2510.17795v3 Announce Type: replace-cross Abstract: Replicating AI research is a crucial yet challenging task for large language model (LLM) agents. Existing approaches often struggle to generate executable code, primarily due to insufficient background knowledge and the limitations of retrieval-augmented generation (RAG) methods, which fail to capture latent technical details hidden in referenced papers. Furthermore, previous […]
LOGICAL-COMMONSENSEQA: A Benchmark for Logical Commonsense Reasoning
arXiv:2601.16504v3 Announce Type: replace-cross Abstract: Commonsense reasoning often involves evaluating multiple plausible interpretations rather than selecting a single atomic answer, yet most benchmarks rely on single-label evaluation, obscuring whether statements are jointly plausible, mutually exclusive, or jointly implausible. We introduce LOGICAL-COMMONSENSEQA, a benchmark that reframes commonsense reasoning as logical composition over pairs of atomic statements […]
From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models
arXiv:2601.15690v2 Announce Type: replace Abstract: While Large Language Models (LLMs) show remarkable capabilities, their unreliability remains a critical barrier to deployment in high-stakes domains. This survey charts a functional evolution in addressing this challenge: the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how […]
Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values
arXiv:2506.00079v2 Announce Type: replace-cross Abstract: The rapid integration of Large Language Models (LLMs) in high-stakes decision-making — such as allocating scarce resources like donor organs — raises critical questions about their alignment with human moral values. We systematically evaluate the behavior of several prominent LLMs against human preferences in kidney allocation scenarios and show that […]
AgileLog: A Forkable Shared Log for Agents on Data Streams
arXiv:2604.14590v2 Announce Type: replace-cross Abstract: In modern data-streaming systems, alongside traditional programs, a new type of entity has emerged that can interact with streaming data: AI agents. Unlike traditional programs, AI agents use LLM reasoning to accomplish high-level tasks specified in natural language over streaming data. Unfortunately, current streaming systems cannot fully support agents: they […]
Different Paths to Harmful Compliance: Behavioral Side Effects and Mechanistic Divergence Across LLM Jailbreaks
arXiv:2604.18510v1 Announce Type: cross Abstract: Open-weight language models can be rendered unsafe through several distinct interventions, but the resulting models may differ substantially in capabilities, behavioral profile, and internal failure mode. We study behavioral and mechanistic properties of jailbroken models across three unsafe routes: harmful supervised fine-tuning (SFT), harmful reinforcement learning with verifiable rewards (RLVR), […]
STEP-Parts: Geometric Partitioning of Boundary Representations for Large-Scale CAD Processing
arXiv:2604.14927v2 Announce Type: replace-cross Abstract: Many CAD learning pipelines discretize Boundary Representations (B-Reps) into triangle meshes, discarding analytic surface structure and topological adjacency and thereby weakening consistent instance-level analysis. We present STEP-Parts, a deterministic CAD-to-supervision toolchain that extracts geometric instance partitions directly from raw STEP B-Reps and transfers them to tessellated carriers through retained source-face […]