arXiv:2606.12942v1 Announce Type: new
Abstract: Generative listwise ranking with Large Multimodal Models (LMMs) aims to capture global list context in a single forward pass, but
its effectiveness degrades in long-context multimodal scenarios. We identify a recurring failure mode, parse collapse, where the
autoregressive decoder produces fluent yet incomplete rankings by silently omitting candidates and terminating early. This
failure stems from limited context utilization rather than simple formatting mistakes, making prompt engineering and constrained
decoding insufficient. We propose PRISMR (Parameterized Representation Internalization for Semantic Multimodal Ranking), a
framework that replaces transient in-context list processing with parametric structural conditioning. PRISMR uses a lightweight
hypernetwork to encode multimodal candidates in parallel and generate item-specific LoRA weights, which are synthesized into an
instance-specific adapter for a LMM. This paradigm enables more robust internalization of list structure while preserving the
base model. We further introduce a large-scale multimodal review-ranking benchmark for evaluation. Experiments demonstrate that
PRISMR substantially reduces parse collapse, improves listwise ranking performance, and transfers effectively across domains and
instruction-tuned backbones.
Crisis support teams’ technological openness and learning attitudes toward the AI based virtual patient system crisis support VR
BackgroundAgainst the backdrop of escalating global humanitarian crises, innovative didactic simulations are becoming increasingly important. A promising alternative to traditional classroom-based didactics for learning psychological