arXiv:2604.10597v3 Announce Type: replace-cross Abstract: Mamba selective state space models (SSMs) provide linear-time sequence modeling but remain sensitive to selective-scan chunk scheduling. We present COREY, a concept-and-feasibility runtime scheduler that maps fixed-bin activation entropy to chunk size. We evaluate COREY in three tiers: a prototype cost model, real-checkpoint kernel timing, and routed end-to-end ablations on […]
Combining Trained Models in Reinforcement Learning
arXiv:2605.02159v1 Announce Type: cross Abstract: Deep reinforcement learning (DRL) has delivered strong results in domains such as Atari and Go, but it still suffers from high sample cost and weak transfer beyond the training setting. A common response is to reuse information from previously trained models through transfer, distillation, ensemble methods, or federated training instead […]
From Prompt to Physical Actuation: Holistic Threat Modeling of LLM-Enabled Robotic Systems
arXiv:2604.27267v2 Announce Type: replace-cross Abstract: As large language models are integrated into autonomous robotic systems for task planning and control, compromised inputs or unsafe model outputs can propagate through the planning pipeline to physical-world consequences. Although prior work has studied robotic cybersecurity, adversarial perception attacks, and LLM safety independently, no existing study traces how these […]
Efficient Preference Poisoning Attack on Offline RLHF
arXiv:2605.02495v1 Announce Type: cross Abstract: Offline Reinforcement Learning from Human Feedback (RLHF) pipelines such as Direct Preference Optimization (DPO) train on a pre-collected preference dataset, which makes them vulnerable to preference poisoning attacks. We study label-flip attacks against log-linear DPO. We first illustrate that flipping one preference label induces a parameter-independent shift in the […]
IConFace: Identity-Structure Asymmetric Conditioning for Unified Reference-Aware Face Restoration
arXiv:2605.02814v1 Announce Type: cross Abstract: Blind face restoration is highly ill-posed under severe degradation, where identity-critical details may be missing from the degraded input. Same-identity references reduce this ambiguity, but mismatched pose, expression, illumination, age, makeup, or local facial states can lead to overuse of reference appearance. We propose IConFace, a unified reference-aware and no-reference […]
Structural Generalization on SLOG without Hand-Written Rules
arXiv:2604.26157v2 Announce Type: replace-cross Abstract: Structural generalization in semantic parsing requires systems to apply learned compositional rules to novel structural combinations. Existing approaches either rely on hand-written algebraic rules (AM-Parser) or fail to generalize structurally (Transformer-based models). We present an alternative requiring no hand-written compositional rules, based on a neural cellular automaton (NCA) with a […]
SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning
arXiv:2601.04809v5 Announce Type: replace Abstract: Reinforcement learning (RL) offers a principled way to enhance the reasoning capabilities of large language models, yet its effectiveness hinges on training signals that remain informative as models evolve. In practice, RL progress often slows when task difficulty becomes poorly aligned with model capability, or when training is dominated by […]
TRAP: Tail-aware Ranking Attack for World-Model Planning
arXiv:2605.01950v1 Announce Type: cross Abstract: World models enable long-horizon planning by internally generating and evaluating imagined trajectories, making them a promising foundation for generalist agents. However, this imagination-driven decision process also introduces new security risks. Existing backdoor attacks typically aim to manipulate local features, one-step predictions, or instantaneous policy outputs. While such objectives may suffice […]
From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
arXiv:2604.15037v3 Announce Type: replace Abstract: Recent advancements in LLM agents are gradually shifting from reactive, text-based paradigms toward proactive, multimodal interaction. However, existing benchmarks primarily focus on reactive responses, overlooking the complexities of proactive intervention and monitoring. To bridge this gap, we introduce ProVoice-Bench, the first evaluation framework specifically designed for proactive voice agents, featuring […]
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills
arXiv:2604.24026v4 Announce Type: replace-cross Abstract: Large language model (LLM) agents increasingly rely on reusable skills: capability packages that combine instructions, control flow, constraints, and tool calls. In current agent systems, however, skills are still represented by text-heavy artifacts, mainly SKILL.md-style documents whose machine-usable evidence remains embedded largely in natural-language descriptions. As a result, skill-centered agent […]
MINT: Multi-Vector Search Index Tuning
arXiv:2504.20018v3 Announce Type: replace-cross Abstract: Vector search plays a crucial role in many real-world applications. In addition to single-vector search, multi-vector search becomes important for multi-modal and multi-feature scenarios today. In a multi-vector database, each row is an item, each column represents a feature of items, and each cell is a high-dimensional vector. In multi-vector […]
Phone2Act: A Low-Cost, Hardware-Agnostic Teleoperation System for Scalable VLA Data Collection
arXiv:2605.01948v1 Announce Type: cross Abstract: Collecting diverse, high-quality manipulation data for Vision-Language-Action (VLA) model training remains prohibitively expensive for many research groups, as existing teleoperation frameworks rely on specialized hardware or are tightly coupled to specific robot platforms. We present Phone2Act, a low-cost, hardware-agnostic teleoperation framework that transforms a commodity smartphone into a 6-DoF robot […]