arXiv:2603.17729v1 Announce Type: cross Abstract: Recent advances in Large Vision-Language Models (LVLMs) have enabled training-free Fine-Grained Visual Recognition (FGVR). However, effectively exploiting LVLMs for FGVR remains challenging due to the inherent visual ambiguity of subordinate-level categories. Existing methods predominantly adopt either retrieval-oriented or reasoning-oriented paradigms to tackle this challenge, but both are constrained by two […]
EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing
arXiv:2509.13399v3 Announce Type: replace-cross Abstract: Instruction-based image editing has advanced rapidly, yet reliable and interpretable evaluation remains a bottleneck. Current protocols either (i) depend on paired reference images, resulting in limited coverage and inheriting biases from prior generative models or (ii) rely solely on zero-shot vision language models (VLMs), whose prompt-based assessments of instruction following, […]
Failing on Bias Mitigation: A Case Study on the Challenges of Fairness in Government Data
arXiv:2601.17054v2 Announce Type: replace-cross Abstract: The potential for bias and unfairness in AI-supporting government services raises ethical and legal concerns. Using crime rate prediction with the Bristol City Council data as a case study, we examine how these issues persist. Rather than auditing real-world deployed systems, our goal is to understand why widely adopted bias […]
UNICORN: Ultrasound Nakagami Imaging via Score Matching and Adaptation for Assessing Hepatic Steatosis
arXiv:2603.16942v1 Announce Type: cross Abstract: Ultrasound imaging is an essential first-line tool for assessing hepatic steatosis. While conventional B-mode ultrasound imaging has limitations in providing detailed tissue characterization, ultrasound Nakagami imaging holds promise for visualizing and quantifying tissue scattering in backscattered signals, with potential applications in fat fraction analysis. However, existing methods for Nakagami imaging […]
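For context on what Nakagami imaging quantifies, here is a minimal sketch of the classical moment-based estimator of the Nakagami shape parameter m and scale parameter Ω from envelope samples, using m = (E[R²])² / Var(R²) and Ω = E[R²]. This is the textbook estimator, not the score-matching method proposed in the paper; the function name `nakagami_moments` and the synthetic Rayleigh data are assumptions for illustration.

```python
import math
import random


def nakagami_moments(envelope):
    """Moment-based estimate of Nakagami (m, omega) from envelope samples.

    Textbook estimator, shown for illustration only (hypothetical helper,
    not the paper's method):
        omega = E[R^2]
        m     = (E[R^2])^2 / Var(R^2)
    """
    squared = [r * r for r in envelope]
    n = len(squared)
    omega = sum(squared) / n
    variance = sum((s - omega) ** 2 for s in squared) / n
    if variance == 0:
        raise ValueError("envelope has zero variance; m is undefined")
    return omega ** 2 / variance, omega


# Synthetic Rayleigh envelope (fully developed speckle), for which m is 1.
rng = random.Random(0)
samples = [math.hypot(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
m_hat, omega_hat = nakagami_moments(samples)
print(f"m = {m_hat:.2f}, omega = {omega_hat:.2f}")
```

In a parametric image this estimate would typically be computed inside a sliding window over the envelope data; the window size is a design choice that the abstract does not specify.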
Macro-Micro Inference: Robust Synaptic Classification via Spike-Triggered Extrapolation
arXiv:2603.16884v1 Announce Type: new Abstract: This work introduces a framework for reconstructing the interaction graph of neuronal networks modeled as multivariate point processes. The methodology performs bivariate inference, identifying synaptic links exclusively from the spike trains of a pair of neurons, without requiring observations of the remaining network activity. We propose a Macro-Micro Extrapolation algorithm […]
YOLO26: An Analysis of NMS-Free End to End Framework for Real-Time Object Detection
arXiv:2601.12882v2 Announce Type: replace-cross Abstract: The “You Only Look Once” (YOLO) framework has long served as a standard for real-time object detection, though earlier versions have relied on Non-Maximum Suppression (NMS) post-processing, which adds latency and extra hyperparameters to tune. This paper presents a comprehensive architectural analysis of YOLO26, a model that shifts toward a native end-to-end […]
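To make the NMS step mentioned in this abstract concrete, here is a minimal sketch of greedy Non-Maximum Suppression in plain Python. It illustrates the generic post-processing step that end-to-end detectors aim to remove and is not code from YOLO26; the names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The `iou_threshold` argument is one of the hyperparameters this step introduces, and the loop over kept boxes is sequential work that runs after the network's forward pass.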
Rel-Zero: Harnessing Patch-Pair Invariance for Robust Zero-Watermarking Against AI Editing
arXiv:2603.17531v1 Announce Type: cross Abstract: Recent advancements in diffusion-based image editing pose a significant threat to the authenticity of digital visual content. Traditional embedding-based watermarking methods often introduce perceptible perturbations to maintain robustness, inevitably compromising visual fidelity. Meanwhile, existing zero-watermarking approaches, typically relying on global image features, struggle to withstand sophisticated manipulations. In this work, […]
PAND: Prompt-Aware Neighborhood Distillation for Lightweight Fine-Grained Visual Classification
arXiv:2602.07768v2 Announce Type: replace-cross Abstract: Distilling knowledge from large Vision-Language Models (VLMs) into lightweight networks is crucial yet challenging in Fine-Grained Visual Classification (FGVC), due to the reliance on fixed prompts and global alignment. To address this, we propose PAND (Prompt-Aware Neighborhood Distillation), a two-stage framework that decouples semantic calibration from structural transfer. First, we […]
Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
arXiv:2510.04072v4 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has become central to enhancing reasoning in large language models (LLMs). Yet on-policy algorithms such as Group Relative Policy Optimization (GRPO) often suffer in early training: noisy gradients from low-quality rollouts lead to unstable updates and inefficient exploration. We introduce Slow-Fast Policy Optimization (SFPO), a simple yet […]
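As background for the GRPO baseline named in this abstract, here is a minimal sketch of the group-relative advantage: each rollout's reward is standardized against the other rollouts sampled for the same prompt. This shows only that commonly described normalization, not SFPO's reposition-before-update procedure; the function name, the use of the population standard deviation, and the `eps` value are assumptions, and implementations differ on these details.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float],
                              eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one group of rollouts for a single prompt.

    Hypothetical helper for illustration: advantage_i = (r_i - mean) / (std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts for one prompt, scored with a binary reward.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])  # [1.0, -1.0, -1.0, 1.0]
```

When every rollout in a group receives the same reward, all advantages are zero and the group contributes no gradient signal, which is one reason updates built from low-quality early rollouts can be noisy or uninformative.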
Omni IIE Bench: Benchmarking the Practical Capabilities of Image Editing Models
arXiv:2603.16944v1 Announce Type: cross Abstract: While Instruction-based Image Editing (IIE) has achieved significant progress, existing benchmarks pursue task breadth via mixed evaluations. This paradigm obscures a failure mode that is critical in professional applications: the inconsistent performance of models across tasks of varying semantic scales. To address this gap, we introduce Omni IIE Bench, a high-quality, […]
KGS-GCN: Enhancing Sparse Skeleton Sensing via Kinematics-Driven Gaussian Splatting and Probabilistic Topology for Action Recognition
arXiv:2603.16943v1 Announce Type: cross Abstract: Skeleton-based action recognition is widely utilized in sensor systems including human-computer interaction and intelligent surveillance. Nevertheless, current sensor devices typically generate sparse skeleton data as discrete coordinates, which inevitably discards fine-grained spatiotemporal details during highly dynamic movements. Moreover, the rigid constraints of predefined physical sensor topologies hinder the modeling of […]
Evaluating Ill-Defined Tasks in Large Language Models
arXiv:2603.17067v1 Announce Type: cross Abstract: Many evaluations of Large Language Models (LLMs) target tasks that are inherently ill-defined, with unclear input and output spaces and ambiguous success criteria. We analyze why existing evaluation benchmarks and metrics fail to provide reliable or diagnostic signals of model capability for such tasks. We examine two case studies: Complex […]