arXiv:2605.17131v1 Announce Type: cross Abstract: The point cloud stands as the most widely adopted format for representing 3D shapes and scenes due to its simplicity and geometric fidelity. However, its inherently unordered and irregular nature, exacerbated by sensor noise and occlusions, introduces unique challenges for machine-learning-based methodologies. To combat these issues, diverse strategies have […]
TTE-Flash: Accelerating Reasoning-based Multimodal Representations via Think-Then-Embed Tokens
arXiv:2605.16638v1 Announce Type: new Abstract: Recent research has demonstrated that Universal Multimodal Embedding (UME) benefits significantly from Chain-of-Thought (CoT) reasoning. In this paradigm, a generative model produces explicit reasoning traces for a multimodal query, with the final representation extracted from an embedding token attending to both the query and the reasoning. Despite its effectiveness, the […]
Systematic Evaluation of Vision Transformers for Automated Cervical Cancer Classification: Optimization, Statistical Validation, and Clinical Interpretability
arXiv:2605.17236v1 Announce Type: cross Abstract: Manual Pap smear analysis for cervical cancer screening is limited by inter-observer variability, time constraints, and restricted expert availability. Although convolutional neural networks (CNNs) have automated cervical cell classification, they remain limited in modeling long-range spatial dependencies and often lack clinical interpretability. In this study, Vision Transformer (ViT) architectures were […]
Rethinking GNNs and Missing Features: Challenges, Evaluation and a Robust Solution
arXiv:2601.04855v2 Announce Type: replace-cross Abstract: Handling missing node features is a key challenge for deploying Graph Neural Networks (GNNs) in real-world domains such as healthcare and sensor networks. Existing studies mostly address relatively benign scenarios, namely benchmark datasets with (a) high-dimensional but sparse node features and (b) incomplete data generated under Missing Completely At Random […]
Attention Hijacking: Response Manipulation Across Queries in Vision-Language Models
arXiv:2605.17310v1 Announce Type: cross Abstract: Existing adversarial attacks on vision-language models (VLMs) can steer model outputs toward attacker-specified target responses, but their effectiveness often degrades when the same perturbed input is paired with different textual queries. This paper studies cross-query response manipulation, where a single adversarial example is expected to remain effective across diverse user […]
Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents
arXiv:2605.16671v1 Announce Type: new Abstract: Rapid biodiversity loss underscores the urgency of effective monitoring, yet manual surveys remain resource-intensive. While on-device AI offers a scalable alternative, its performance in the wild is often challenged by environmental variability. Current methods rely heavily on cloud resources, which requires continuous uploading of field data for model retraining. This […]
Progressive Generalization Augmentation with Deeply Coupled RND-PPO and Domain-Prioritized Noise Injection for Robust Crop Management Reinforcement Learning
arXiv:2605.17428v1 Announce Type: cross Abstract: Our preliminary experiments on gym-DSSAT maize irrigation tasks revealed that ±2 °C temperature noise causes an 11.9% reduction in economic returns for PPO policies trained under clean conditions – a systematic robustness deficit that existing research has not adequately addressed. This paper tackles three interconnected limitations impeding practical deployment […]
Identifying Latent Actions and Dynamics from Offline Data via Demonstrator Diversity
arXiv:2603.17577v2 Announce Type: replace-cross Abstract: Can latent actions and environment dynamics be recovered from offline trajectories when actions are never observed? We study this question in a setting where trajectories are action-free but tagged with demonstrator identity. We assume that each demonstrator follows a distinct policy, while the environment dynamics are shared across demonstrators and […]
Rethinking Code Review in the Age of AI: A Vision for Agentic Code Review
arXiv:2605.17548v1 Announce Type: cross Abstract: Code review has evolved for decades, from informal peer checking to today’s pull request (PR) workflows, yet it remains a largely manual, uneven, and cognitively demanding process. The rise of Artificial Intelligence (AI) coding assistants has intensified this challenge: while these tools increase code production velocity, they also expand the […]
LinAlg-Bench: A Forensic Benchmark Revealing Structural Failure Modes in LLM Mathematical Reasoning
arXiv:2605.16675v1 Announce Type: new Abstract: We introduce LinAlg-Bench, a diagnostic benchmark evaluating 10 frontier large language models on structured linear algebra computation across a strict dimensional gradient of 3×3, 4×4, and 5×5 matrices. Spanning 9 task types and 660 SymPy-certified problems, the benchmark exhaustively evaluates 6,600 model outputs. Beyond binary accuracy, LinAlg-Bench introduces a three-stage […]
Fine-tuning Pocket-Aware Diffusion Models via Denoising Policy Optimization
arXiv:2605.17693v1 Announce Type: cross Abstract: Structure-based drug design has been accelerated by pocket-aware 3D generative models, yet most methods primarily fit the training distribution and may fall short of satisfying multiple properties required in real-world therapeutic drug discovery. Recently, increasing attention has focused on structure-based molecule optimization (SBMO), which targets fine-grained control over multiple specified […]
Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue
arXiv:2605.12920v2 Announce Type: replace-cross Abstract: Effective collaboration between embodied agents requires more than acting in a shared environment; it demands communication grounded in each agent’s evolving understanding of the world. When agents can only partially observe their surroundings, coordination without communication is provably hard, but communication can, in principle, bridge this gap by allowing agents […]