arXiv:2601.12781v2 Announce Type: replace Abstract: Referring Expression Comprehension (REC) aims to localize the image region corresponding to a natural language query. Recent neuro-symbolic REC approaches leverage large language models (LLMs) and vision-language models (VLMs) to perform compositional reasoning, decomposing queries into structured programs and executing them step-by-step. While such approaches achieve interpretable reasoning and strong […]
PolicySim: An LLM-Based Agent Social Simulation Sandbox for Proactive Policy Optimization
arXiv:2603.19649v1 Announce Type: cross Abstract: Social platforms serve as central hubs for information exchange, where user behaviors and platform interventions jointly shape opinions. However, intervention policies like recommendation and content filtering, can unintentionally amplify echo chambers and polarization, posing significant societal risks. Proactively evaluating the impact of such policies is therefore crucial. Existing approaches primarily […]
Pseudo-Simulation for Autonomous Driving
arXiv:2506.04218v3 Announce Type: replace-cross Abstract: Existing evaluation paradigms for Autonomous Vehicles (AVs) face critical limitations. Real-world evaluation is often challenging due to safety concerns and a lack of reproducibility, whereas closed-loop simulation can face insufficient realism or high computational costs. Open-loop evaluation, while being efficient and data-driven, relies on metrics that generally overlook compounding errors. […]
LHAW: Controllable Underspecification for Long-Horizon Tasks
arXiv:2602.10525v2 Announce Type: replace-cross Abstract: Long-horizon workflow agents that operate effectively over extended periods are essential for truly autonomous systems. Their reliable execution critically depends on the ability to reason through ambiguous situations in which clarification seeking is necessary to ensure correct task execution. However, progress is limited by the lack of scalable, task-agnostic frameworks […]
Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
arXiv:2511.16665v3 Announce Type: replace-cross Abstract: The emergence of Large Language Models (LLMs) with strong reasoning capabilities marks a significant milestone, unlocking new frontiers in complex problem-solving. However, training these reasoning models, typically using Reinforcement Learning (RL), encounters critical efficiency bottlenecks: response generation during RL training exhibits a persistent long-tail distribution, where a few very long […]
OmniDiT: Extending Diffusion Transformer to Omni-VTON Framework
arXiv:2603.19643v1 Announce Type: cross Abstract: Despite the rapid advancement of Virtual Try-On (VTON) and Try-Off (VTOFF) technologies, existing VTON methods face challenges with fine-grained detail preservation, generalization to complex scenes, complicated pipeline, and efficient inference. To tackle these problems, we propose OmniDiT, an omni Virtual Try-On framework based on the Diffusion Transformer, which combines try-on […]
Sensing Without Colocation: Operator-Based Virtual Instrumentation for Domains Beyond Physical Reach
arXiv:2510.18041v2 Announce Type: replace-cross Abstract: Classical sensing rests on one foundational assumption: the quantity of interest must be colocated with the measurement device. This is not an engineering convenience. It is the organizing principle of every instrumentation standard developed over the past century, and it fails completely at aviation altitude, where no physical sensor can […]
IFNSO: Iteration-Free Newton-Schulz Orthogonalization
arXiv:2602.02500v3 Announce Type: replace-cross Abstract: The Newton-Schulz (NS) iteration has become a key technique for orthogonalization in optimizers such as Muon and for optimization on the Stiefel manifold. Despite its effectiveness, the conventional NS iteration incurs significant computational overhead due to repeated high-dimensional matrix multiplications. To overcome these limitations, we propose Iteration-Free Newton-Schulz Orthogonalization (IFNSO), […]
StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors
arXiv:2602.08934v2 Announce Type: replace-cross Abstract: AI-text detectors face a critical robustness challenge: adversarial paraphrasing attacks that preserve semantics while evading detection. We introduce StealthRL, a reinforcement learning framework that stress-tests detector robustness under realistic adversarial conditions. StealthRL trains a paraphrase policy against a multi-detector ensemble using Group Relative Policy Optimization (GRPO) with LoRA adapters on […]
MetaCues: Enabling Critical Engagement with Generative AI for Information Seeking and Sensemaking
arXiv:2603.19634v1 Announce Type: cross Abstract: Generative AI (GenAI) search tools are increasingly used for information seeking, yet their design tends to encourage cognitive offloading, which may lead to passive engagement, selective attention, and informational homogenization. Effective use requires metacognitive engagement to craft good prompts, verify AI outputs, and critically engage with information. We developed MetaCues, […]
S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition
arXiv:2603.18062v2 Announce Type: replace-cross Abstract: Skeleton-based action recognition is crucial for multimedia applications but heavily relies on power-hungry Artificial Neural Networks (ANNs), limiting their deployment on resource-constrained edge devices. Spiking Neural Networks (SNNs) provide an energy-efficient alternative; however, existing spiking models for skeleton data often compromise the intrinsic sparsity of SNNs by resorting to dense […]
Accelerating Large-Scale Cheminformatics Using a Byte-Offset Indexing Architecture for Terabyte-Scale Data Integration
arXiv:2601.18921v2 Announce Type: replace-cross Abstract: The integration of large-scale chemical databases represents a critical bottleneck in modern cheminformatics research, particularly for machine learning applications requiring high-quality, multi-source validated datasets. This paper presents a case study of integrating three major public chemical repositories: PubChem (176 million compounds), ChEMBL, and eMolecules, to construct a curated dataset for […]