AQuA: Toward Strategic Response Generation for Ambiguous Visual Questions

March 10, 2026

arXiv:2603.07394v1 Announce Type: cross
Abstract: Visual Question Answering (VQA) is a core task for evaluating the capabilities of Vision-Language Models (VLMs). Existing VQA benchmarks primarily feature clear and unambiguous image-question pairs, whereas real-world scenarios often involve varying degrees of ambiguity that require nuanced reasoning and context-appropriate response strategies. Although recent studies have begun to address ambiguity in VQA, they lack (1) a systematic categorization of ambiguity levels and (2) datasets and models that support strategy-aware responses. In this paper, we introduce Ambiguous Visual Question Answering (AQuA), a fine-grained dataset that classifies ambiguous VQA instances into four levels according to the nature and degree of ambiguity, along with the optimal response strategy for each case. Our evaluation of diverse open-source and proprietary VLMs shows that most models fail to adapt their strategy to the ambiguity type, frequently producing overconfident answers rather than seeking clarification or acknowledging uncertainty. To address this challenge, we fine-tune VLMs on AQuA, enabling them to adaptively choose among multiple response strategies, such as directly answering, inferring intent from contextual cues, listing plausible alternatives, or requesting clarification. VLMs trained on AQuA achieve strategic response generation for ambiguous VQA, demonstrating the ability to recognize ambiguity, manage uncertainty, and respond with context-appropriate strategies, while outperforming both open-source and closed-source baselines.
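
As a rough illustration of what "adapting the strategy to the ambiguity type" could mean in an evaluation, the sketch below scores predicted response strategies against gold strategy labels. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the four strategy names come from the abstract, but the `Strategy` enum, the `strategy_match_rate` and `overconfidence_rate` functions, and the toy labels are all hypothetical, and the abstract does not say which strategy is paired with which ambiguity level.

```python
from enum import Enum


class Strategy(Enum):
    # The four response strategies named in the abstract.
    DIRECT_ANSWER = "direct_answer"
    INFER_INTENT = "infer_intent"
    LIST_ALTERNATIVES = "list_alternatives"
    REQUEST_CLARIFICATION = "request_clarification"


def strategy_match_rate(gold, predicted):
    """Fraction of instances whose predicted strategy equals the gold one.

    Hypothetical metric, assumed here for illustration only.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    hits = sum(g is p for g, p in zip(gold, predicted))
    return hits / len(gold)


def overconfidence_rate(gold, predicted):
    """Among instances whose gold strategy is NOT a direct answer,
    the fraction where the model answered directly anyway.

    Hypothetical proxy for the "overconfident answers" behaviour
    described in the abstract.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = [(g, p) for g, p in zip(gold, predicted)
             if g is not Strategy.DIRECT_ANSWER]
    if not pairs:
        return 0.0
    return sum(p is Strategy.DIRECT_ANSWER for _, p in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy labels, invented for this example (not taken from AQuA).
    gold = [
        Strategy.DIRECT_ANSWER,
        Strategy.INFER_INTENT,
        Strategy.LIST_ALTERNATIVES,
        Strategy.REQUEST_CLARIFICATION,
    ]
    predicted = [
        Strategy.DIRECT_ANSWER,
        Strategy.DIRECT_ANSWER,
        Strategy.LIST_ALTERNATIVES,
        Strategy.DIRECT_ANSWER,
    ]
    print(strategy_match_rate(gold, predicted))
    print(overconfidence_rate(gold, predicted))
```

On the toy labels, two of four strategies match, and two of the three instances that call for something other than a direct answer are answered directly anyway. Keeping the two numbers separate shows whether a model is merely inaccurate or is failing specifically by answering too confidently.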

