arXiv:2603.25328v1 Announce Type: new Abstract: Automated Vehicle (AV) control in mixed traffic, where AVs coexist with human-driven vehicles, poses significant challenges in balancing safety, traffic efficiency, comfort, fuel efficiency, and compliance with traffic rules while capturing heterogeneous driver behavior. Traditional car-following models, such as the Intelligent Driver Model (IDM), often struggle to generalize across diverse traffic […]
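The abstract is truncated above, but the IDM it cites is fully specified in the car-following literature. A minimal sketch of the standard model follows; the parameter values are illustrative textbook defaults, not values from this paper.

```python
import math

def idm_acceleration(v, v_lead, gap,
                     v0=30.0,    # desired speed (m/s)
                     T=1.5,      # safe time headway (s)
                     a_max=1.0,  # maximum acceleration (m/s^2)
                     b=2.0,      # comfortable deceleration (m/s^2)
                     s0=2.0,     # minimum jam distance (m)
                     delta=4.0): # free-flow acceleration exponent
    """Standard Intelligent Driver Model (Treiber et al., 2000).

    v: ego speed, v_lead: leader speed, gap: bumper-to-bumper distance (m).
    Returns the commanded acceleration.
    """
    dv = v - v_lead  # approach rate toward the leader
    # Desired dynamic gap: jam distance + headway term + braking interaction term.
    s_star = s0 + v * T + (v * dv) / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (max(s_star, 0.0) / gap) ** 2)

# Example: closing on a slower leader 20 m ahead -> the model commands braking.
print(idm_acceleration(v=25.0, v_lead=20.0, gap=20.0))
```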
Seeking Physics in Diffusion Noise
arXiv:2603.14294v2 Announce Type: replace-cross Abstract: Do video diffusion models encode signals predictive of physical plausibility? We probe intermediate denoising representations of a pretrained Diffusion Transformer (DiT) and find that physically plausible and implausible videos are partially separable in mid-layer feature space across noise levels. This separability cannot be fully attributed to visual quality or generator […]
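The truncated abstract does not describe the probing setup, but a common way to test such separability is a linear probe over pooled intermediate activations at each noise level. The sketch below uses stand-in random features and a hypothetical per-noise-level feature dictionary; it shows only the generic probing pattern, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical setup: features[t] holds pooled mid-layer DiT activations for
# each video at denoising noise level t, shape (n_videos, d); labels mark
# physically plausible (1) vs. implausible (0) videos. Random stand-in data
# is used here in place of real activations.
rng = np.random.default_rng(0)
n, d = 200, 512
features = {t: rng.normal(size=(n, d)) for t in (0.25, 0.50, 0.75)}
labels = rng.integers(0, 2, size=n)

for t, X in features.items():
    probe = LogisticRegression(max_iter=1000)  # linear probe
    acc = cross_val_score(probe, X, labels, cv=5).mean()
    print(f"noise level {t}: probe accuracy {acc:.2f} (0.5 = inseparable)")
```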
Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models
arXiv:2603.25412v1 Announce Type: new Abstract: Large language models (LLMs) increasingly rely on explicit chain-of-thought (CoT) reasoning to solve complex tasks, yet the safety of the reasoning process itself remains largely unaddressed. Existing work on LLM safety focuses on content safety (detecting harmful, biased, or factually incorrect outputs) and treats the reasoning chain as an opaque […]
A Decade-Scale Benchmark Evaluating LLMs’ Clinical Practice Guidelines Detection and Adherence in Multi-turn Conversations
arXiv:2603.25196v1 Announce Type: cross Abstract: Clinical practice guidelines (CPGs) play a pivotal role in ensuring evidence-based decision-making and improving patient outcomes. While Large Language Models (LLMs) are increasingly deployed in healthcare scenarios, it is unclear to what extent LLMs can identify and adhere to CPGs during conversations. To address this gap, we introduce CPGBench, an […]
Cross-Model Disagreement as a Label-Free Correctness Signal
arXiv:2603.25450v1 Announce Type: new Abstract: Detecting when a language model is wrong without ground truth labels is a fundamental challenge for safe deployment. Existing approaches rely on a model’s own uncertainty — such as token entropy or confidence scores — but these signals fail critically on the most dangerous failure mode: confident errors, where a […]
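One natural instantiation of the idea in this abstract is to treat the minority share among several models' answers as a label-free error signal. The sketch below is an assumed implementation; the function name and normalization are invented here, not taken from the paper.

```python
from collections import Counter

def disagreement_flag(answers):
    """Label-free correctness signal from cross-model disagreement.

    answers: one answer string per model. Returns the fraction of models
    deviating from the majority answer, plus the majority answer itself.
    A high score suggests the answer may be wrong even when each model
    is individually confident.
    """
    counts = Counter(a.strip().lower() for a in answers)
    majority, votes = counts.most_common(1)[0]
    return 1.0 - votes / len(answers), majority

# Hypothetical usage with three models' answers to the same question.
score, answer = disagreement_flag(["Paris", "Paris", "Lyon"])
print(f"majority={answer!r}, disagreement={score:.2f}")  # 0.33 -> flag for review
```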
UtilityMax Prompting: A Formal Framework for Multi-Objective Large Language Model Optimization
arXiv:2603.11583v2 Announce Type: replace-cross Abstract: The success of a Large Language Model (LLM) task depends heavily on its prompt. Most use cases specify prompts in natural language, which is inherently ambiguous when multiple objectives must be satisfied simultaneously. In this paper, we introduce UtilityMax Prompting, a framework that specifies tasks using formal mathematical language. We reconstruct […]
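The abstract cuts off before the formalism, but the core idea (replace an ambiguous instruction with an explicit, weighted utility over objectives) can be illustrated with a toy scalarization. The objectives, weights, and scoring rules below are all invented for the example and are not the paper's framework.

```python
# Hypothetical illustration: rank candidate outputs by an explicit weighted
# utility rather than an ambiguous natural-language instruction.
WEIGHTS = {"brevity": 0.3, "coverage": 0.5, "formality": 0.2}
KEYWORDS = {"revenue", "growth", "risk"}  # invented coverage targets

def utility(text):
    """Scalarized multi-objective utility; each score lies in [0, 1]."""
    words = text.lower().split()
    scores = {
        "brevity": 1.0 / (1.0 + len(words) / 50.0),          # shorter is better
        "coverage": len(KEYWORDS & set(words)) / len(KEYWORDS),
        "formality": 0.0 if "!" in text else 1.0,             # crude proxy
    }
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

candidates = ["revenue growth is strong and risk remains low",
              "Huge growth!!!"]
print(max(candidates, key=utility))  # picks the high-coverage, formal answer
```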
Modeling the mutational dynamics of very short tandem repeats
arXiv:2603.25628v1 Announce Type: new Abstract: Short tandem repeats (STRs) are low-entropy regions in the genome, consisting of a short (1-6 bp) unit that is consecutively repeated multiple times. They are known for high mutational instability due to so-called stutter-mutations, in which the number of units in the run increases or decreases. In particular, STRs with […]
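A stutter-mutation process of the kind the abstract describes can be sketched as a biased random walk on the repeat number, with a mutation rate that grows with run length (longer runs are less stable). The rates and gain/loss bias below are illustrative only, not estimates from the paper.

```python
import random

def simulate_str(n0=10, generations=1000, rate_per_unit=1e-4, p_gain=0.5):
    """Toy stutter-mutation model for one STR locus.

    Each generation, a run of n units mutates with probability
    ~ n * rate_per_unit; a mutation gains or loses exactly one unit.
    All parameter values here are illustrative.
    """
    n = n0
    for _ in range(generations):
        if n > 1 and random.random() < n * rate_per_unit:
            n += 1 if random.random() < p_gain else -1
    return n

random.seed(42)
print([simulate_str() for _ in range(5)])  # final repeat counts of 5 loci
```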
Probing the Lack of Stable Internal Beliefs in LLMs
arXiv:2603.25187v1 Announce Type: cross Abstract: Persona-driven large language models (LLMs) require consistent behavioral tendencies across interactions to simulate human-like personality traits, such as persistence or reliability. However, current LLMs often lack stable internal representations that anchor their responses over extended dialogues. This work explores whether LLMs can maintain “implicit consistency”, defined as persistent adherence to […]
A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures
arXiv:2603.25022v1 Announce Type: new Abstract: Knowledge distillation, model extraction, and behavior transfer have become central concerns in frontier AI. The main risk is not merely copying, but the possibility that useful capability can be transferred more cheaply than the governance structure that originally accompanied it. This paper presents a public, trade-secret-safe theoretical framework for reducing […]
Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI
arXiv:2603.11413v3 Announce Type: replace-cross Abstract: Ramaswamy et al. reported in Nature Medicine that ChatGPT Health under-triages 51.6% of emergencies, concluding that consumer-facing AI triage poses safety risks. However, their evaluation used an exam-style protocol — forced A/B/C/D output, knowledge suppression, and suppression of clarifying questions — that differs fundamentally from how consumers use health chatbots. […]
From Stateless to Situated: Building a Psychological World for LLM-Based Emotional Support
arXiv:2603.25031v1 Announce Type: new Abstract: In psychological support and emotional companionship scenarios, the core limitation of large language models (LLMs) lies not merely in response quality, but in their reliance on local next-token prediction, which prevents them from maintaining the temporal continuity, stage awareness, and user consent boundaries required for multi-turn intervention. This stateless characteristic […]
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model
arXiv:2603.25184v1 Announce Type: cross Abstract: Reinforcement learning (RL) has become essential for post-training large language models (LLMs) in reasoning tasks. While scaling rollouts can stabilize training and enhance performance, the computational overhead is a critical issue. In algorithms like GRPO, multiple rollouts per prompt incur prohibitive costs, as a large portion of prompts provide negligible […]
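The truncated abstract alludes to a well-known property of GRPO: when every rollout for a prompt receives the same reward, the group-normalized advantages are all zero and the prompt contributes no gradient. The sketch below shows that zero-signal filter in isolation; the paper's online-verified selection mechanism is presumably more involved, and the names here are invented for illustration.

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages as in GRPO: per-prompt reward z-scores
    across the rollout group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

def informative(rewards):
    """A prompt whose rollouts all score identically (all solved or all
    failed) yields zero advantages, hence no learning signal; skip it."""
    return np.std(rewards) > 0

# Rewards from 4 rollouts each for three hypothetical prompts.
groups = {"p1": [1, 1, 1, 1], "p2": [0, 1, 0, 1], "p3": [0, 0, 0, 0]}
kept = {p: grpo_advantages(r) for p, r in groups.items() if informative(r)}
print(list(kept))  # only 'p2' provides a gradient signal
```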