arXiv:2603.16884v1 Announce Type: new Abstract: This work introduces a framework for reconstructing the interaction graph of neuronal networks modeled as multivariate point processes. The methodology performs bivariate inference, identifying synaptic links exclusively from the spike trains of a pair of neurons, without requiring observations of the remaining network activity. We propose a Macro-Micro Extrapolation algorithm […]
YOLO26: An Analysis of NMS-Free End-to-End Framework for Real-Time Object Detection
arXiv:2601.12882v2 Announce Type: replace-cross Abstract: The “You Only Look Once” (YOLO) framework has long served as a standard for real-time object detection, though traditional iterations have relied on Non-Maximum Suppression (NMS) post-processing, which introduces additional latency and tuning hyperparameters. This paper presents a comprehensive architectural analysis of YOLO26, a model that shifts toward a native end-to-end […]
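For context on what an NMS-free design removes, the following is a minimal sketch of the classic greedy NMS post-processing step that traditional YOLO iterations apply (not YOLO26's actual pipeline): boxes are sorted by confidence, and any box overlapping a kept box above an IoU threshold is suppressed. The threshold is one of the hyperparameters the abstract alludes to.

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping rivals, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep
```

The sequential dependence of this loop (each kept box prunes the candidate list) is what makes NMS hard to batch and a source of inference latency; an end-to-end model instead trains the head to emit at most one box per object.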
Rel-Zero: Harnessing Patch-Pair Invariance for Robust Zero-Watermarking Against AI Editing
arXiv:2603.17531v1 Announce Type: cross Abstract: Recent advancements in diffusion-based image editing pose a significant threat to the authenticity of digital visual content. Traditional embedding-based watermarking methods often introduce perceptible perturbations to maintain robustness, inevitably compromising visual fidelity. Meanwhile, existing zero-watermarking approaches, typically relying on global image features, struggle to withstand sophisticated manipulations. In this work, […]
PAND: Prompt-Aware Neighborhood Distillation for Lightweight Fine-Grained Visual Classification
arXiv:2602.07768v2 Announce Type: replace-cross Abstract: Distilling knowledge from large Vision-Language Models (VLMs) into lightweight networks is crucial yet challenging in Fine-Grained Visual Classification (FGVC), due to the reliance on fixed prompts and global alignment. To address this, we propose PAND (Prompt-Aware Neighborhood Distillation), a two-stage framework that decouples semantic calibration from structural transfer. First, we […]
Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
arXiv:2510.04072v4 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has become central to enhancing reasoning in large language models (LLMs). Yet on-policy algorithms such as Group Relative Policy Optimization (GRPO) often suffer in early training: noisy gradients from low-quality rollouts lead to unstable updates and inefficient exploration. We introduce Slow-Fast Policy Optimization (SFPO), a simple yet […]
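To make the instability the abstract describes concrete, here is a sketch of the group-relative advantage computation commonly attributed to GRPO (the SFPO reposition step itself is not shown): each rollout's reward is normalized against its group's mean and standard deviation, so a group of uniformly low-quality rollouts yields a degenerate or noise-dominated signal.

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: z-score each rollout's reward within its group.
    When all rollouts score alike (e.g. all fail), the group carries no
    learning signal and the gradient is driven by noise, the early-training
    problem SFPO targets."""
    mu = statistics.fmean(rewards)
    sigma = statistics.pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]  # degenerate group: zero advantage for all
    return [(r - mu) / sigma for r in rewards]
```

A group with mixed outcomes such as `[1, 0, 1, 0]` produces clean ±1 advantages, while `[0, 0, 0, 0]` produces none, illustrating why early rollout quality matters so much for on-policy RL.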
Omni IIE Bench: Benchmarking the Practical Capabilities of Image Editing Models
arXiv:2603.16944v1 Announce Type: cross Abstract: While Instruction-based Image Editing (IIE) has achieved significant progress, existing benchmarks pursue task breadth via mixed evaluations. This paradigm obscures a failure mode that is critical in professional applications: the inconsistent performance of models across tasks of varying semantic scales. To address this gap, we introduce Omni IIE Bench, a high-quality, […]
KGS-GCN: Enhancing Sparse Skeleton Sensing via Kinematics-Driven Gaussian Splatting and Probabilistic Topology for Action Recognition
arXiv:2603.16943v1 Announce Type: cross Abstract: Skeleton-based action recognition is widely utilized in sensor systems including human-computer interaction and intelligent surveillance. Nevertheless, current sensor devices typically generate sparse skeleton data as discrete coordinates, which inevitably discards fine-grained spatiotemporal details during highly dynamic movements. Moreover, the rigid constraints of predefined physical sensor topologies hinder the modeling of […]
Evaluating Ill-Defined Tasks in Large Language Models
arXiv:2603.17067v1 Announce Type: cross Abstract: Many evaluations of Large Language Models (LLMs) target tasks that are inherently ill-defined, with unclear input and output spaces and ambiguous success criteria. We analyze why existing evaluation benchmarks and metrics fail to provide reliable or diagnostic signals of model capability for such tasks. We examine two case studies: Complex […]
Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles
arXiv:2603.17111v1 Announce Type: cross Abstract: Ensembling Vision-Language Models (VLMs) from different providers maximizes benchmark accuracy, yet models from the same architectural family share correlated errors that standard voting ignores. We study this structure across 17 VLMs from 8 families on VQAv2, TextVQA, and GQA. Family-correlated errors reduce effective ensemble dimensionality to 2.5-3.6 independent voters and […]
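The structure the abstract describes can be illustrated with a hypothetical family-aware voting rule (an illustration of the general idea, not necessarily the paper's fix): instead of giving every model a ballot, each architectural family is first collapsed to a single majority answer, so "clone" models with correlated errors cannot outvote independent families.

```python
from collections import Counter

def family_aware_vote(answers, families):
    """Majority vote where each family gets exactly one ballot.
    answers[i] is model i's answer; families[i] is its family label."""
    by_family = {}
    for ans, fam in zip(answers, families):
        by_family.setdefault(fam, []).append(ans)
    # Collapse each family to its internal majority answer, then vote across families.
    family_votes = [Counter(v).most_common(1)[0][0] for v in by_family.values()]
    return Counter(family_votes).most_common(1)[0][0]
```

With three same-family models answering "cat" and two independent families answering "dog", plain voting returns "cat" while the family-aware rule returns "dog", the scenario in which correlated errors reduce the effective number of independent voters.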
Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems
arXiv:2603.17176v1 Announce Type: cross Abstract: Retrieval augmented generation systems have become an integral part of everyday life. Whether in internet search engines, email systems, or service chatbots, these systems are based on context retrieval and answer generation with large language models. As these systems spread, so do their security vulnerabilities. Attackers are increasingly focusing on these […]
Continual Multimodal Egocentric Activity Recognition via Modality-Aware Novel Detection
arXiv:2603.16970v1 Announce Type: cross Abstract: Multimodal egocentric activity recognition integrates visual and inertial cues for robust first-person behavior understanding. However, deploying such systems in open-world environments requires detecting novel activities while continuously learning from non-stationary streams. Existing methods rely on the main logits for novelty scoring, without fully exploiting the complementary evidence available from individual […]
Dependence Fidelity and Downstream Inference Stability in Generative Models
arXiv:2603.17041v1 Announce Type: cross Abstract: Recent advances in generative AI have led to increasingly realistic synthetic data, yet evaluation criteria remain focused on marginal distribution matching. While these diagnostics assess local realism, they provide limited insight into whether a generative model preserves the multivariate dependence structures governing downstream inference. We introduce covariance-level dependence fidelity as […]
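One simple realization of a covariance-level check (an illustrative metric under our own assumptions, not necessarily the paper's exact criterion) is the relative Frobenius distance between the covariance matrices of real and synthetic samples: marginals can match perfectly while this gap stays large, which is the blind spot the abstract highlights.

```python
import numpy as np

def covariance_fidelity_gap(real, synth):
    """Relative Frobenius distance between sample covariance matrices.
    real, synth: 2-D arrays with rows = observations, columns = variables.
    Returns 0 when the synthetic data reproduces the dependence structure exactly."""
    c_real = np.cov(real, rowvar=False)
    c_synth = np.cov(synth, rowvar=False)
    return np.linalg.norm(c_real - c_synth, "fro") / np.linalg.norm(c_real, "fro")
```

A generator could match every marginal histogram yet shuffle away cross-variable correlations; this gap would expose that, while per-column realism diagnostics would not.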