arXiv:2605.22795v2 Announce Type: replace-cross Abstract: We propose and analyze a conservative drifting method for one-step generative modeling. The method replaces the original displacement-based drifting velocity by a kernel density estimator (KDE)-gradient velocity, namely the difference of the kernel-smoothed data score and the kernel-smoothed model score. This velocity is a gradient field, addressing the non-conservatism issue […]
Intent Signal Theory: A Computational Framework for Intent-State Control in Human-AI Interaction
arXiv:2605.25058v1 Announce Type: cross Abstract: Current AI interaction models treat the prompt as the primary object of exchange, omitting a critical layer: the user’s latent source intent, the goal state preceding and motivating the prompt. Here we introduce Intent Signal Theory (IST), a computational framework that formalises this missing intent layer. IST distinguishes four objects […]
DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning
arXiv:2605.23939v1 Announce Type: new Abstract: Web agents require both high-level reasoning (for task decomposition) and low-level interactions (for page element manipulation) to conduct different tasks. However, these knowledge types differ fundamentally: reasoning knowledge (e.g., booking a flight requires first searching for routes) is abstract and transferable across websites, while interaction knowledge (e.g., clicking the Search […]
STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media
arXiv:2605.25162v1 Announce Type: cross Abstract: Large language models for vertical domains are bottlenecked by the scarcity of complex, domain-specific task-oriented dialogues. Existing data acquisition pipelines face a persistent trilemma: expert annotation is expensive, real-world service conversations are constrained by privacy and commercial restrictions, and static corpora quickly become temporally stale. We propose STREAM, a data-centric […]
A Controlled Synthetic Benchmark for Educational Aspect-Based Sentiment Analysis
arXiv:2605.25502v1 Announce Type: cross Abstract: Educational aspect-based sentiment analysis (ABSA) can support course improvement, but public aspect-labeled student feedback remains scarce because educational reviews are private, institution-specific, and expensive to annotate. This study introduces a controlled synthetic benchmark for educational ABSA built from 10,000 synthetic course reviews with explicit train-validation-test splits and a 20-aspect pedagogical […]
Quantifying Empirical Compute-Supervision Tradeoffs in RLVR
arXiv:2605.25252v1 Announce Type: cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a standard paradigm for post-training language models, but in practice, verifiers are rarely perfect. Recent theoretical work predicts that verifier noise affects the rate of learning but not its final outcome, implying that sufficient compute should close any gap induced by imperfect […]
Residual Drift Dominates Contradiction in Multi-Turn Constraint Reasoning
arXiv:2605.23940v1 Announce Type: new Abstract: How do multi-turn reasoning systems fail? The expected answer is logical contradiction, in which the system’s maintained state becomes unsatisfiable. We show that the dominant mode is instead satisfiable drift, where the internal state stays consistent while the returned answer silently violates prior commitments. We build DRIFT-Bench (Decomposing Reasoning Into […]
SAMark: A Self-Anchored Text Watermarking with Paragraph-Level Paraphrase Robustness
arXiv:2605.25796v1 Announce Type: cross Abstract: Semantic-level watermarking (SWM) improves robustness against text modifications by treating sentences as the basic unit. However, robustness to paragraph-level paraphrasing remains difficult because such attacks globally disrupt watermark signals by changing sentence order. In this work, we propose SAMark, a self-anchored watermarking framework that removes the dependency on sentence order […]
SA-Kura: An Energy-Efficient Systolic Array Accelerator for Locally-Coupled Kuramoto Drift in Diffusion Sampling
arXiv:2605.24016v1 Announce Type: cross Abstract: Diffusion inference remains costly for edge deployment, yet existing accelerators focus almost exclusively on score networks because standard drift is merely a trivial linear scaling. Kuramoto orientation diffusion replaces this trivial drift with locally coupled phase interactions, improving sampling efficiency but introducing a new hardware bottleneck: a center-dependent nonlinear 5 […]
MEMOR-E: In-Context and Fine-Tuned LLM Personalization for Alzheimer’s Assistive Robotics
arXiv:2605.23941v1 Announce Type: new Abstract: Alzheimer’s disease is a neurodegenerative disorder marked by progressive declines in memory and language that reduce independence in daily life, motivating socially assistive robotic support. This paper presents MEMOR-E, a mobile quadruped robot with an interactive tablet interface that assists patients and caregivers through medication reminders, routine guidance, memory-oriented […]
More Skills, Worse Agents? Skill Shadowing Degrades Performance When Expanding Skill Libraries
arXiv:2605.24050v1 Announce Type: cross Abstract: Skill libraries allow LLM agents to load task-specific instructions on demand, letting non-expert users solve domain-specific tasks through natural language without knowing which skills exist or how they work. However, performance degrades as libraries grow — by up to 21% when scaling from a small set of helpful skills to […]
Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals
arXiv:2605.26045v1 Announce Type: cross Abstract: Activation oracles aim to make the activations of other models legible to humans and yield promising results compared to white-box interpretability techniques. However, uncertainty quantification (UQ) for the natural-language outputs of such activation oracles is so far understudied. Here, we investigate 6 different methods for estimating the confidence of activation […]