arXiv:2603.06854v1 Announce Type: cross
Abstract: Multimodal large language models can exhibit text dominance, over-relying on linguistic priors instead of grounding predictions in non-text inputs. Large audio-language models (LALMs) are one example: decisive audio evidence can go under-utilized even when it carries the information needed for the task. To address this issue, we use mechanistic interpretability to identify a small set of audio-specialist attention heads whose attention to audio tokens yields a “listening” signal. We show that this signal increases when audio evidence affects the model’s output, providing an indicator of audio engagement under standard prompting. Leveraging this localization, we construct an audio–silence steering direction and apply an inference-time activation intervention to the final representation, amplifying the influence of audio on the model’s output. To demonstrate the utility of this intervention, we show that it improves accuracy on MMAU by up to +8.0 percentage points for two Qwen-based LALMs, without any parameter updates.
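A minimal PyTorch sketch of this kind of activation steering, assuming a HuggingFace-style model that returns per-layer hidden states and attention weights and accepts forward hooks; the head indices (audio_head_ids), the steering scale alpha, and the choice of final layer are illustrative assumptions, not the paper's configuration:

import torch

def listening_signal(attn_weights, audio_token_mask, audio_head_ids):
    # Mean attention mass that the hypothesized audio-specialist heads
    # place on audio token positions (one layer).
    # attn_weights: (num_heads, seq_len, seq_len); audio_token_mask: (seq_len,) bool
    heads = attn_weights[audio_head_ids]                    # (k, seq, seq)
    mass_on_audio = heads[..., audio_token_mask].sum(-1)    # (k, seq)
    return mass_on_audio.mean().item()

@torch.no_grad()
def steering_direction(model, audio_inputs, silence_inputs):
    # Audio-minus-silence difference of the final hidden state at the last
    # position, averaged over a batch of paired prompts, then normalized.
    h_audio = model(**audio_inputs, output_hidden_states=True).hidden_states[-1][:, -1]
    h_sil = model(**silence_inputs, output_hidden_states=True).hidden_states[-1][:, -1]
    d = (h_audio - h_sil).mean(dim=0)
    return d / d.norm()

def add_steering_hook(final_layer, direction, alpha=4.0):
    # Inference-time intervention: add alpha * direction to the final
    # representation; returns the hook handle so it can be removed later.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * direction.to(hidden.dtype)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return final_layer.register_forward_hook(hook)

In use, one would register the hook on the model's last decoder block (e.g., model.model.layers[-1] in many Qwen-style checkpoints, though the exact attribute path varies by implementation) and remove the handle afterwards; alpha trades off stronger audio grounding against drift from the model's original output distribution.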
BadLLM-TG: A Backdoor Defender powered by LLM Trigger Generator
arXiv:2603.15692v1 Announce Type: cross
Abstract: Backdoor attacks compromise model reliability by using triggers to manipulate outputs. Trigger inversion can accurately locate these triggers via a
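As background on the trigger-inversion step, a minimal, generic sketch in the classic Neural-Cleanse style, not BadLLM-TG's specific pipeline: it optimizes a small mask-and-pattern trigger that pushes clean inputs toward a target class, assuming a classifier model that returns logits and a loader of clean data; all names and hyperparameters are illustrative.

import itertools
import torch
import torch.nn.functional as F

def invert_trigger(model, loader, target_class, steps=500, lam=1e-3, lr=0.1):
    # Optimize a soft mask m and pattern p so that (1 - m) * x + m * p is
    # classified as target_class; a sparse yet highly effective solution is
    # evidence that target_class carries a backdoor.
    batches = itertools.cycle(loader)
    x0, _ = next(iter(loader))
    mask = torch.full_like(x0[:1], -3.0).requires_grad_()   # near-zero after sigmoid
    pattern = torch.zeros_like(x0[:1]).requires_grad_()
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    for _ in range(steps):
        x, _ = next(batches)
        m = torch.sigmoid(mask)                             # soft trigger mask in [0, 1]
        x_trig = (1 - m) * x + m * torch.tanh(pattern)
        y = torch.full((x.size(0),), target_class, dtype=torch.long)
        loss = F.cross_entropy(model(x_trig), y) + lam * m.abs().sum()  # prefer small triggers
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()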

