arXiv:2602.22545v2 Announce Type: replace-cross Abstract: Tau positron emission tomography (tau-PET) is an important in vivo biomarker of Alzheimer’s disease, but its cost, limited availability, and acquisition burden restrict broad clinical use. This work proposes an interpretable multimodal image synthesis framework for generating tau-PET from paired T1-weighted and FLAIR MRI. The proposed model combines a Partial […]
MCLR: Improving Conditional Modeling via Inter-Class Likelihood-Ratio Maximization and Unifying Classifier-Free Guidance with Alignment Objectives
arXiv:2603.22364v2 Announce Type: replace-cross Abstract: Diffusion models have achieved state-of-the-art performance in generative modeling, but their success often relies heavily on classifier-free guidance (CFG), an inference-time heuristic that modifies the sampling trajectory. From a theoretical perspective, diffusion models trained with standard denoising score matching (DSM) are expected to recover the target data distribution, raising the […]
Small Vision-Language Models are Smart Compressors for Long Video Understanding
arXiv:2604.08120v1 Announce Type: cross Abstract: Adapting Multimodal Large Language Models (MLLMs) for hour-long videos is bottlenecked by context limits. Dense visual streams saturate token budgets and exacerbate the lost-in-the-middle phenomenon. Existing heuristics, like sparse sampling or uniform pooling, blindly sacrifice fidelity by discarding decisive moments and wasting bandwidth on irrelevant backgrounds. We propose Tempo, an […]
AT-ADD: All-Type Audio Deepfake Detection Challenge Evaluation Plan
arXiv:2604.08184v1 Announce Type: cross Abstract: The rapid advancement of Audio Large Language Models (ALLMs) has enabled cost-effective, high-fidelity generation and manipulation of both speech and non-speech audio, including sound effects, singing voices, and music. While these capabilities foster creativity and content production, they also introduce significant security and trust challenges, as realistic audio deepfakes can […]
CIAO – Code In Architecture Out – Automated Software Architecture Documentation with Large Language Models
arXiv:2604.08293v1 Announce Type: cross Abstract: Software architecture documentation is essential for system comprehension, yet it is often unavailable or incomplete. While recent LLM-based techniques can generate documentation from code, they typically address local artifacts rather than producing coherent, system-level architectural descriptions. This paper presents a structured process for automatically generating system-level architectural documentation directly from […]
Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification
arXiv:2604.08333v1 Announce Type: cross Abstract: The rise of multimodal large language models (MLLMs) has sparked an unprecedented wave of applications in the field of medical imaging analysis. However, as one of the earliest and most fundamental tasks integrated into this paradigm, medical image classification reveals a sobering reality: state-of-the-art medical MLLMs consistently underperform compared to […]
Phantasia: Context-Adaptive Backdoors in Vision Language Models
arXiv:2604.08395v1 Announce Type: cross Abstract: Recent advances in Vision-Language Models (VLMs) have greatly enhanced the integration of visual perception and linguistic reasoning, driving rapid progress in multimodal understanding. Despite these achievements, the security of VLMs, particularly their vulnerability to backdoor attacks, remains significantly underexplored. Existing backdoor attacks on VLMs are still in an early stage […]
Trust the AI, Doubt Yourself: The Effect of Urgency on Self-Confidence in Human-AI Interaction
arXiv:2604.07535v1 Announce Type: new Abstract: Studies show that interactions with an AI system foster human users' trust in AI. An often overlooked element of such interaction dynamics is the (sense of) urgency when the human user is prompted by an AI agent, e.g., for advice or guidance. In this paper, we show that although […]
Hybrid CNN-Transformer Architecture for Arabic Speech Emotion Recognition
arXiv:2604.07357v1 Announce Type: cross Abstract: Recognizing emotions from speech using machine learning has become an active research area due to its importance in building human-centered applications. However, while many studies have been conducted in English, German, and other European and Asian languages, research in Arabic remains scarce because of the limited availability of annotated datasets. […]
Are we still able to recognize pearls? Machine-driven peer review and the risk to creativity: An explainable RAG-XAI detection framework with markers extraction
arXiv:2604.07964v1 Announce Type: new Abstract: The integration of large language models (LLMs) into peer review raises a concern beyond authorship and detection: the potential cascading automation of the entire editorial process. As reviews become partially or fully machine-generated, it becomes plausible that editorial decisions may also be delegated to algorithmic systems, leading to a fully […]
Wiring the ‘Why’: A Unified Taxonomy and Survey of Abductive Reasoning in LLMs
arXiv:2604.08016v1 Announce Type: new Abstract: Although it plays a foundational role in human discovery and sense-making, abductive reasoning, the inference of the most plausible explanation for an observation, has been relatively underexplored in Large Language Models (LLMs). Despite the rapid advancement of LLMs, the exploration of abductive reasoning and its diverse facets has thus far been disjointed rather […]
An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks
arXiv:2604.07883v1 Announce Type: new Abstract: History textbooks often contain implicit biases, nationalist framing, and selective omissions that are difficult to audit at scale. We propose an agentic evaluation architecture comprising a multimodal screening agent, a heterogeneous jury of five evaluative agents, and a meta-agent for verdict synthesis and human escalation. A central contribution is a […]