ConTrans: Learning Text-enhanced Local-global Temporal Representations for Zero-shot Temporal Action Localization

Target-Side Paraphrase Augmentation for Sign Language Translation with Large Language Models

arXiv:2605.31393v1 Announce Type: cross Abstract: Sign language translation (SLT) remains constrained by limited paired sign-video/text corpora and heavy-tailed target vocabularies. We study target-side augmentation in

Simple Token-Efficient Vision-Language Model for Case-level Pathology Synoptic Report Generation

arXiv:2605.30716v1 Announce Type: cross Abstract: Generating clinically useful case-level pathology reports from whole-slide images (WSIs) is challenging due to gigapixel resolution, long visual-token

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

arXiv:2605.30120v2 Announce Type: replace-cross Abstract: Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However,

Same Patient, Different Words, Different Diagnosis? Evaluating Semantic Stability in Clinical LLMs

arXiv:2605.30646v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used in clinical applications. However, their behavior remains highly sensitive to subtle linguistic variations,

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

arXiv:2605.30654v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used as conversational partners for companionship, emotional disclosure, and interpersonal advice, but the social