arXiv:2602.03151v2 Announce Type: replace
Abstract: Vision Language Models (VLMs) typically assume complete modality input at inference time. However, their effectiveness drops sharply when certain modalities are unavailable or incomplete. Current research on missing modalities faces two dilemmas: prompt-based methods struggle to restore missing yet indispensable features and degrade the generalizability of the VLM, while imputation-based approaches, lacking effective guidance, are prone to generating semantically irrelevant noise. Restoring precise semantics while preserving a VLM's generalization remains challenging. We therefore propose a general missing-modality restoration strategy. We introduce an enhanced diffusion model as a pluggable mid-stage training module that effectively restores missing features. Our strategy introduces two key innovations: (I) Dynamic Modality Gating, which adaptively leverages conditional features to guide the generation of semantically consistent features; and (II) a Cross-Modal Mutual Learning mechanism, which bridges the semantic spaces of the dual models to achieve bi-directional alignment. Notably, our strategy preserves the integrity of the pre-trained VLM, requiring no fine-tuning of the backbone models while significantly boosting resilience to information loss. Zero-shot evaluations across benchmark datasets demonstrate that our approach consistently outperforms existing baselines, establishing it as a robust and scalable extension that ensures VLM reliability across diverse missing rates and conditions. Our code and models will be publicly available.
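The abstract does not spell out how Dynamic Modality Gating is computed; one plausible reading is a learned scalar gate per modality that weights each available modality's conditional features and zeroes out absent ones before they condition the diffusion module. The sketch below illustrates that reading only — the function name, gate parameterization (a single linear head), and fusion-by-sum are all assumptions, not the paper's actual formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_modality_gate(feats, present, w, b):
    """Fuse per-modality conditional features with learned scalar gates.

    feats:   (n_modalities, dim) conditional features (arbitrary values if missing)
    present: (n_modalities,) binary availability mask
    w, b:    parameters of a hypothetical linear gate head, shapes (dim,) and scalar
    """
    g = sigmoid(feats @ w + b) * present      # gate in (0, 1); forced to 0 for absent modalities
    return (g[:, None] * feats).sum(axis=0)   # gated sum -> (dim,) conditioning vector

# Toy usage: the third modality is masked out, so its features cannot leak
# into the conditioning signal regardless of their values.
feats = np.random.randn(3, 16)
cond = dynamic_modality_gate(feats, np.array([1.0, 1.0, 0.0]),
                             w=np.zeros(16), b=0.0)
print(cond.shape)  # (16,)
```

The multiplicative mask makes missing-modality robustness explicit: the gate, not the downstream network, decides which conditional streams guide generation.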

