arXiv:2603.22150v1 Announce Type: new Abstract: The basic and effective reproduction numbers are widely used metrics for characterizing the dynamics of infectious disease epidemics. However, the interpretation of these numbers is based on the assumption of homogeneous mixing and may not hold in real-world populations where the contact patterns deviate from that assumption. In this paper, […]
Your Robot Will Feel You Now: Empathy in Robots and Embodied Agents
arXiv:2603.20200v1 Announce Type: cross Abstract: The fields of human-robot interaction (HRI) and embodied conversational agents (ECAs) have long studied how empathy could be implemented in machines. One of the major drivers has been the goal of giving multimodal social and emotional intelligence to these artificially intelligent agents, which interact with people through facial expressions, body, […]
Me, Myself, and $pi$ : Evaluating and Explaining LLM Introspection
arXiv:2603.20276v1 Announce Type: new Abstract: A hallmark of human intelligence is Introspection-the ability to assess and reason about one’s own cognitive processes. Introspection has emerged as a promising but contested capability in large language models (LLMs). However, current evaluations often fail to distinguish genuine meta-cognition from the mere application of general world knowledge or text-based […]
AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwidth Collapse
arXiv:2603.20285v1 Announce Type: new Abstract: Cooperative multi-agent methods for embodied AI are almost universally evaluated under idealized communication: zero latency, no packet loss, and unlimited bandwidth. Real-world deployment on robots with wireless links, autonomous vehicles on congested networks, or drone swarms in contested spectrum offers no such guarantees. We introduce AgentComm-Bench, a benchmark suite and […]
Locally Coherent Parallel Decoding in Diffusion Language Models
arXiv:2603.20216v1 Announce Type: cross Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models, offering sub-linear generation latency and bidirectional capabilities that are particularly appealing for code generation and editing. Achieving sub-linear latency in discrete DLMs requires predicting multiple tokens in parallel. However, standard DLMs sample tokens independently from conditional […]
FactorSmith: Agentic Simulation Generation via Markov Decision Process Decomposition with Planner-Designer-Critic Refinement
arXiv:2603.20270v1 Announce Type: new Abstract: Generating executable simulations from natural language specifications remains a challenging problem due to the limited reasoning capacity of large language models (LLMs) when confronted with large, interconnected codebases. This paper presents FactorSmith, a framework that synthesizes playable game simulations in code from textual descriptions by combining two complementary ideas: factored […]
Implicit Maximum Likelihood Estimation for Real-time Generative Model Predictive Control
arXiv:2603.13733v2 Announce Type: replace-cross Abstract: Diffusion-based models have recently shown strong performance in trajectory planning, as they are capable of capturing diverse, multimodal distributions of complex behaviors. A key limitation of these models is their slow inference speed, which results from the iterative denoising process. This makes them less suitable for real-time applications such as […]
Transformer-Based Predictive Maintenance for Risk-Aware Instrument Calibration
arXiv:2603.20297v1 Announce Type: cross Abstract: Accurate calibration is essential for instruments whose measurements must remain traceable, reliable, and compliant over long operating periods. Fixed-interval programs are easy to administer, but they ignore that instruments drift at different rates under different conditions. This paper studies calibration scheduling as a predictive maintenance problem: given recent sensor histories, […]
CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution
arXiv:2603.17775v2 Announce Type: replace-cross Abstract: Label-free reinforcement learning enables large language models to improve reasoning capabilities without ground-truth supervision, typically by treating majority-voted answers as pseudo-labels. However, we identify a critical failure mode: as training maximizes self-consistency, output diversity collapses, causing the model to confidently reinforce systematic errors that evade detection. We term this the […]
Evaluating LLM-Generated Lessons from the Language Learning Students’ Perspective: A Short Case Study on Duolingo
arXiv:2603.18873v2 Announce Type: replace-cross Abstract: Popular language learning applications such as Duolingo use large language models (LLMs) to generate lessons for its users. Most lessons focus on general real-world scenarios such as greetings, ordering food, or asking directions, with limited support for profession-specific contexts. This gap can hinder learners from achieving professional-level fluency, which we […]
InjectFlow: Weak Guides Strong via Orthogonal Injection for Flow Matching
arXiv:2603.20303v1 Announce Type: cross Abstract: Flow Matching (FM) has recently emerged as a leading approach for high-fidelity visual generation, offering a robust continuous-time alternative to ordinary differential equation (ODE) based models. However, despite their success, FM models are highly sensitive to dataset biases, which cause severe semantic degradation when generating out-of-distribution or minority-class samples. In […]
enhancing reasoning accuracy in large language models during inference time
arXiv:2603.21301v1 Announce Type: cross Abstract: Large Language Models (LLMs) often exhibit strong linguistic abilities while remaining unreliable on multi-step reasoning tasks, particularly when deployed without additional training or fine-tuning. In this work, we study inference-time techniques to improve the reasoning accuracy of LLMs. We systematically evaluate three classes of inference-time strategies: (i) self-consistency via stochastic […]