arXiv:2601.22384v2 Announce Type: replace-cross Abstract: Graphs provide a natural representation of relational structure that arises across diverse domains. Despite this ubiquity, graph structure is typically learned in a modality- and task-isolated manner, where graph representations are constructed within individual task contexts and discarded thereafter. As a result, structural regularities across modalities and tasks are repeatedly […]
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty
arXiv:2603.15500v2 Announce Type: replace Abstract: LLMs often exhibit Aha moments such as self-correction after tokens like “Wait,” yet the underlying mechanism remains unclear. Standard LLMs collapse mainly through silent divergence, where trajectories drift from the correct answer yet remain locally coherent, so no explicit error triggers reactive self-correction. We introduce an information-theoretic framework that separates […]
FrontierOR: Benchmarking LLMs’ Capacity for Efficient Algorithm Design in Large-Scale Optimization
arXiv:2605.25246v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used for optimization modeling and solver-code generation, yet practical operations research and optimization problems often require a harder capability: designing scalable algorithms that exploit problem structure and outperform direct formulation-and-solve baselines. Existing benchmarks are limited to small or simplified examples far below real-world scale […]
When Eyes Betray AI: Social Gaze Consistency as a Semantic Cue for AI-Generated Image Detection
arXiv:2605.27348v1 Announce Type: cross Abstract: Recent generative models have largely closed the gap on low-level artifacts – pixel fingerprints, frequency anomalies, upsampling traces – particularly in person-centric and partial-edit settings where the manipulated region is small and surrounded by photometrically authentic content. We introduce Social Gaze Consistency, a high-level semantic cue defined as the mutual […]
Identifiable Token Correspondence for World Models
arXiv:2605.16457v3 Announce Type: replace-cross Abstract: Token-based transformer world models have shown strong performance in visual reinforcement learning, but often suffer from temporal inconsistency in long-horizon rollouts, including object duplication, disappearance, and transmutation. A key reason is that most existing approaches treat next-frame prediction purely as a token generation problem, without considering the persistence of tokens […]
LEC: Linear Expectation Constraints for Selection-Conditioned Risk Control in Selective Prediction and Routing Systems
arXiv:2512.01556v3 Announce Type: replace Abstract: Foundation models often generate unreliable answers, while heuristic uncertainty estimators fail to fully distinguish correct from incorrect outputs, causing users to accept erroneous answers without any statistical guarantee. We address this problem through selection-conditioned risk control, aiming to ensure that an accepted prediction has an error probability no larger than […]
From Feasible to Practical: Pareto-Optimal Synthesis Planning
arXiv:2605.07521v2 Announce Type: replace Abstract: Current computer-aided synthesis planning (CASP) methods often treat retrosynthesis as solved once a single feasible route is identified, focusing primarily on convergence or shortest-path metrics. This view is misaligned with real-world practice, where chemists must balance competing objectives such as cost, sustainability, toxicity, and overall yield. To address this, we […]
Self-Cascaded Diffusion Models for Arbitrary-Scale Image Super-Resolution
arXiv:2506.07813v2 Announce Type: replace-cross Abstract: Arbitrary-scale image super-resolution aims to upsample images to any desired resolution, offering greater flexibility than traditional fixed-scale super-resolution. Recent approaches based on regression-based or generative models have shown promising results but often suffer from scale inconsistency due to their single-stage formulation, which must handle a wide range of scaling factors […]
Mechanistic Interpretability of Antibody Language Models Using SAEs
arXiv:2512.05794v3 Announce Type: replace-cross Abstract: Sparse autoencoders (SAEs) are a mechanistic interpretability technique that has been used to provide insight into learned concepts within large protein language models. Here, we employ TopK and Ordered SAEs to investigate autoregressive antibody language models, and steer their generation. We show that TopK SAEs can reveal biologically meaningful latent […]
MedCollab: IBIS-Guided Multi-Agent Collaboration with Hierarchical Disease Relation Chains for Clinical Diagnosis
arXiv:2603.01131v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have shown promise in clinical diagnosis but remain limited by unreliable report generation, weak evidence grounding, and opaque reasoning. We propose MedCollab, an IBIS-guided multi-agent framework for full-cycle clinical diagnosis and diagnostic report generation. Mimicking hospital consultation, MedCollab dynamically recruits specialist and exam agents from patient […]
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language
arXiv:2604.19667v2 Announce Type: replace-cross Abstract: At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllability. However, in current practice, such workflows are almost entirely constructed through manual engineering: developers must carefully design workflows, write prompts for each step, and repeatedly revise the logic as requirements […]
Tracing the Dynamics of Refusal: Exploiting Latent Refusal Trajectories for Robust Jailbreak Detection
arXiv:2605.02958v2 Announce Type: replace-cross Abstract: Representation Engineering analyses often characterize refusal using static directions extracted from terminal or pooled representations. We ask whether this view misses how refusal is constructed across layer-token positions. Using causal tracing, we identify a Refusal Trajectory: a sparse upstream activation pattern that often persists even when attacks such as GCG […]