Interpretable Tau-PET Synthesis from Multimodal T1-Weighted and FLAIR MRI Using Partial Information Decomposition Guided Disentangled Quantized Half-UNet

arXiv:2602.22545v2 Announce Type: replace-cross Abstract: Tau positron emission tomography (tau-PET) is an important in vivo biomarker of Alzheimer’s disease, but its cost, limited availability, and acquisition burden restrict broad clinical use. This work proposes an interpretable multimodal image synthesis framework for generating tau-PET from paired T1-weighted and FLAIR MRI. The proposed model combines a Partial […]

MCLR: Improving Conditional Modeling via Inter-Class Likelihood-Ratio Maximization and Unifying Classifier-Free Guidance with Alignment Objectives

arXiv:2603.22364v2 Announce Type: replace-cross Abstract: Diffusion models have achieved state-of-the-art performance in generative modeling, but their success often relies heavily on classifier-free guidance (CFG), an inference-time heuristic that modifies the sampling trajectory. From a theoretical perspective, diffusion models trained with standard denoising score matching (DSM) are expected to recover the target data distribution, raising the […]

Small Vision-Language Models are Smart Compressors for Long Video Understanding

arXiv:2604.08120v1 Announce Type: cross Abstract: Adapting Multimodal Large Language Models (MLLMs) for hour-long videos is bottlenecked by context limits. Dense visual streams saturate token budgets and exacerbate the lost-in-the-middle phenomenon. Existing heuristics, like sparse sampling or uniform pooling, blindly sacrifice fidelity by discarding decisive moments and wasting bandwidth on irrelevant backgrounds. We propose Tempo, an […]

AT-ADD: All-Type Audio Deepfake Detection Challenge Evaluation Plan

arXiv:2604.08184v1 Announce Type: cross Abstract: The rapid advancement of Audio Large Language Models (ALLMs) has enabled cost-effective, high-fidelity generation and manipulation of both speech and non-speech audio, including sound effects, singing voices, and music. While these capabilities foster creativity and content production, they also introduce significant security and trust challenges, as realistic audio deepfakes can […]

CIAO – Code In Architecture Out – Automated Software Architecture Documentation with Large Language Models

arXiv:2604.08293v1 Announce Type: cross Abstract: Software architecture documentation is essential for system comprehension, yet it is often unavailable or incomplete. While recent LLM-based techniques can generate documentation from code, they typically address local artifacts rather than producing coherent, system-level architectural descriptions. This paper presents a structured process for automatically generating system-level architectural documentation directly from […]

Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification

arXiv:2604.08333v1 Announce Type: cross Abstract: The rise of multimodal large language models (MLLMs) has sparked an unprecedented wave of applications in the field of medical imaging analysis. However, as one of the earliest and most fundamental tasks integrated into this paradigm, medical image classification reveals a sobering reality: state-of-the-art medical MLLMs consistently underperform compared to […]

Phantasia: Context-Adaptive Backdoors in Vision Language Models

arXiv:2604.08395v1 Announce Type: cross Abstract: Recent advances in Vision-Language Models (VLMs) have greatly enhanced the integration of visual perception and linguistic reasoning, driving rapid progress in multimodal understanding. Despite these achievements, the security of VLMs, particularly their vulnerability to backdoor attacks, remains significantly underexplored. Existing backdoor attacks on VLMs are still in an early stage […]

Hybrid CNN-Transformer Architecture for Arabic Speech Emotion Recognition

arXiv:2604.07357v1 Announce Type: cross Abstract: Recognizing emotions from speech using machine learning has become an active research area due to its importance in building human-centered applications. However, while many studies have been conducted in English, German, and other European and Asian languages, research in Arabic remains scarce because of the limited availability of annotated datasets. […]

Are we still able to recognize pearls? Machine-driven peer review and the risk to creativity: An explainable RAG-XAI detection framework with markers extraction

arXiv:2604.07964v1 Announce Type: new Abstract: The integration of large language models (LLMs) into peer review raises a concern beyond authorship and detection: the potential cascading automation of the entire editorial process. As reviews become partially or fully machine-generated, it becomes plausible that editorial decisions may also be delegated to algorithmic systems, leading to a fully […]

Wiring the ‘Why’: A Unified Taxonomy and Survey of Abductive Reasoning in LLMs

arXiv:2604.08016v1 Announce Type: new Abstract: Regardless of its foundational role in human discovery and sense-making, abductive reasoning–the inference of the most plausible explanation for an observation–has been relatively underexplored in Large Language Models (LLMs). Despite the rapid advancement of LLMs, the exploration of abductive reasoning and its diverse facets has thus far been disjointed rather […]

An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks

arXiv:2604.07883v1 Announce Type: new Abstract: History textbooks often contain implicit biases, nationalist framing, and selective omissions that are difficult to audit at scale. We propose an agentic evaluation architecture comprising a multimodal screening agent, a heterogeneous jury of five evaluative agents, and a meta-agent for verdict synthesis and human escalation. A central contribution is a […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.