arXiv:2508.21052v2 Announce Type: replace-cross Abstract: We introduce FakeParts, a new class of deepfakes characterized by subtle, localized manipulations to specific spatial regions or temporal segments of otherwise authentic videos. Unlike fully synthetic content, these partial manipulations – ranging from altered facial expressions to object substitutions and background modifications – blend seamlessly with real elements, making […]
PathBench-MIL: A Comprehensive AutoML and Benchmarking Framework for Multiple Instance Learning in Histopathology
arXiv:2512.17517v1 Announce Type: cross Abstract: We introduce PathBench-MIL, an open-source AutoML and benchmarking framework for multiple instance learning (MIL) in histopathology. The system automates end-to-end MIL pipeline construction, including preprocessing, feature extraction, and MIL-aggregation, and provides reproducible benchmarking of dozens of MIL models and feature extractors. PathBench-MIL integrates visualization tooling, a unified configuration system, and […]
Machine Learning-Driven Predictive Resource Management in Complex Science Workflows
arXiv:2509.11512v2 Announce Type: replace-cross Abstract: The collaborative efforts of large communities in science experiments, often comprising thousands of global members, reflect a monumental commitment to exploration and discovery. Recently, advanced and complex data processing has gained increasing importance in science experiments. Data processing workflows typically consist of multiple intricate steps, and the precise specification of […]
Learning Spatio-Temporal Feature Representations for Video-Based Gaze Estimation
arXiv:2512.17673v1 Announce Type: cross Abstract: Video-based gaze estimation methods aim to capture the inherently temporal dynamics of human eye gaze from multiple image frames. However, since models must capture both spatial and temporal relationships, performance is limited by the feature representations not only within a frame but also between multiple frames. We propose the Spatio-Temporal Gaze Network […]
Pix2NPHM: Learning to Regress NPHM Reconstructions From a Single Image
arXiv:2512.17773v1 Announce Type: cross Abstract: Neural Parametric Head Models (NPHMs) are a recent advancement over mesh-based 3D morphable models (3DMMs) to facilitate high-fidelity geometric detail. However, fitting NPHMs to visual inputs is notoriously challenging due to the expressive nature of their underlying latent space. To this end, we propose Pix2NPHM, a vision transformer (ViT) network […]
The Semantic Illusion: Certified Limits of Embedding-Based Hallucination Detection in RAG Systems
arXiv:2512.15068v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) systems remain susceptible to hallucinations despite grounding in retrieved evidence. While current detection methods leverage embedding similarity and natural language inference (NLI), their reliability in safety-critical settings remains unproven. We apply conformal prediction to RAG hallucination detection, transforming heuristic scores into decision sets with finite-sample coverage guarantees […]
Re-Depth Anything: Test-Time Depth Refinement via Self-Supervised Re-lighting
arXiv:2512.17908v1 Announce Type: cross Abstract: Monocular depth estimation remains challenging as recent foundation models, such as Depth Anything V2 (DA-V2), struggle with real-world images that are far from the training distribution. We introduce Re-Depth Anything, a test-time self-supervision framework that bridges this domain gap by fusing DA-V2 with the powerful priors of large-scale 2D diffusion […]
More Consistent Accuracy PINN via Alternating Easy-Hard Training
arXiv:2512.17607v1 Announce Type: cross Abstract: Physics-informed neural networks (PINNs) have recently emerged as a prominent paradigm for solving partial differential equations (PDEs), yet their training strategies remain underexplored. While hard prioritization methods inspired by finite element methods are widely adopted, recent research suggests that easy prioritization can also be effective. Nevertheless, we find that both […]
Replace, Don’t Expand: Mitigating Context Dilution in Multi-Hop RAG via Fixed-Budget Evidence Assembly
arXiv:2512.10787v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) systems often fail on multi-hop queries when the initial retrieval misses a bridge fact. Prior corrective approaches, such as Self-RAG, CRAG, and Adaptive-$k$, typically address this by adding more context or pruning existing lists. However, simply expanding the context window often leads to context dilution, where distractors […]
SportsGPT: An LLM-driven Framework for Interpretable Sports Motion Assessment and Training Guidance
arXiv:2512.14121v2 Announce Type: replace-cross Abstract: Existing intelligent sports analysis systems mainly focus on “scoring and visualization,” often lacking automatic performance diagnosis and interpretable training guidance. Recent advances in Large Language Models (LLMs) and motion analysis techniques provide new opportunities to address the above limitations. In this paper, we propose SportsGPT, an LLM-driven framework for interpretable […]