Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization

arXiv:2603.17052v1 Announce Type: cross Abstract: Vector quantization is a technique in machine learning that discretizes continuous representations into a set of discrete vectors. It is widely employed in tokenizing data representations for large language models, diffusion models, and other generative models. Despite its prevalence, the characteristics and behaviors of vector quantization in generative models remain […]
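The quantization step this abstract refers to can be sketched in a few lines: each continuous vector is snapped to its nearest codebook entry, and the entry's index becomes the discrete token. The codebook and inputs below are illustrative, not taken from the paper.

```python
def vector_quantize(x, codebook):
    """Map each continuous vector to the index of its nearest codebook entry."""
    ids = []
    for v in x:
        # squared Euclidean distance to every codebook vector
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        ids.append(dists.index(min(dists)))
    return ids, [codebook[i] for i in ids]

codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]  # illustrative 3-entry codebook
ids, quantized = vector_quantize([[0.9, 1.1], [0.1, -0.2]], codebook)
# ids are the discrete tokens; quantized are the snapped vectors
```

The paper's "early quantization" concern is about how this mapping interacts with training dynamics (how many codebook entries stay in use); the lookup itself is the same.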

REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge

arXiv:2603.17145v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed as automated evaluators that assign numeric scores to model outputs, a paradigm known as LLM-as-a-Judge. However, standard Reinforcement Learning (RL) methods typically rely on binary rewards (e.g., 0-1 accuracy), thereby ignoring the ordinal structure inherent in regression tasks; for instance, they fail to […]
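The contrast the abstract draws between binary and ordinal-aware rewards can be made concrete. The distance-based reward below is one simple way to preserve ordinal structure, not necessarily REAL's actual formulation; the score scale and normalizer are assumptions.

```python
def binary_reward(pred, target):
    # 0-1 accuracy: a near miss and a far miss are equally wrong
    return 1.0 if pred == target else 0.0

def regression_aware_reward(pred, target, max_err=4):
    # Distance-sensitive reward on an assumed 1-5 judge scale:
    # closer predictions earn more, preserving ordinal structure
    return max(0.0, 1.0 - abs(pred - target) / max_err)

# Judge target score 4: predicting 3 vs 1 looks identical to a binary reward,
# but the regression-aware reward distinguishes them.
```

For example, with target 4, both `binary_reward(3, 4)` and `binary_reward(1, 4)` return 0.0, while the regression-aware variant scores the near miss (3) well above the far miss (1).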

Generalist Multimodal LLMs Gain Biometric Expertise via Human Salience

arXiv:2603.17173v1 Announce Type: cross Abstract: Iris presentation attack detection (PAD) is critical for secure biometric deployments, yet developing specialized models faces significant practical barriers: collecting data representing future unknown attacks is impossible, and collecting diverse-enough data, yet still limited in terms of its predictive power, is expensive. Additionally, sharing biometric data raises privacy concerns. Due […]

Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing

arXiv:2603.17199v1 Announce Type: cross Abstract: Large language models (LLMs) can produce chains of thought (CoT) that do not accurately reflect the actual factors driving their answers. In multiple-choice settings with an injected hint favoring a particular option, models may shift their final answer toward the hinted option and produce a CoT that rationalizes the response […]
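Activation probing of the kind described here typically means fitting a linear classifier on hidden-state vectors to predict a behavioral label (e.g., "was the answer hint-influenced?"). The sketch below trains a logistic-regression probe from scratch on toy two-dimensional "activations"; the data, dimensionality, and labels are all illustrative, not from the paper.

```python
import math

def train_linear_probe(acts, labels, lr=0.1, epochs=200):
    """Fit a logistic-regression probe (w, b) on activation vectors via SGD."""
    d = len(acts[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for x, y in zip(acts, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))      # sigmoid
            g = p - y                        # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def probe_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy activations: label 1 = "hinted" runs shifted along one direction
acts = [[-1.0, 0.2], [-1.2, -0.1], [-0.8, 0.0],
        [1.0, 0.1], [1.1, -0.2], [0.9, 0.3]]
labels = [0, 0, 0, 1, 1, 1]
w, b = train_linear_probe(acts, labels)
```

In the paper's setting the probe would be run on activations captured before and after the CoT, to test whether the hint's influence is linearly decodable at each point.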

Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text

arXiv:2603.17217v1 Announce Type: cross Abstract: Responsible use of AI demands that we protect sensitive information without undermining the usefulness of data, an imperative that has become acute in the age of large language models. We address this challenge with an on-premise, LLM-driven substitution pipeline that anonymizes text by replacing personally identifiable information (PII) with realistic, […]
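The paper's pipeline is LLM-driven, but the core substitution idea (replace detected PII spans with realistic surrogates rather than redaction tags) can be illustrated with a rule-based stand-in. The patterns and surrogate values below are hypothetical placeholders, not the paper's method.

```python
import re

# Illustrative surrogates: realistic-looking stand-ins, not redaction tags
SURROGATES = {"EMAIL": "jane.doe@example.com", "PHONE": "+1-555-0100"}

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize(text):
    """Replace each detected PII span with a realistic surrogate of the same type."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(SURROGATES[kind], text)
    return text

out = anonymize("Contact bob@corp.io or 020 7946 0958.")
```

An LLM-driven version would swap the regex detectors for model-based PII detection and generate context-appropriate surrogates, which is what lets it handle names, addresses, and other PII that resists simple patterns.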

From Drop-off to Recovery: A Mechanistic Analysis of Segmentation in MLLMs

arXiv:2603.17228v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) are increasingly applied to pixel-level vision tasks, yet their intrinsic capacity for spatial understanding remains poorly understood. We investigate segmentation capacity through a layerwise linear probing evaluation across the entire MLLM pipeline: vision encoder, adapter, and LLM. We further conduct an intervention-based attention knockout […]

Pathology-Aware Multi-View Contrastive Learning for Patient-Independent ECG Reconstruction

arXiv:2603.17248v1 Announce Type: cross Abstract: Reconstructing a 12-lead electrocardiogram (ECG) from a reduced lead set is an ill-posed inverse problem due to anatomical variability. Standard deep learning methods often ignore underlying cardiac pathology, losing vital morphology in precordial leads. We propose Pathology-Aware Multi-View Contrastive Learning, a framework that regularizes the latent space through a pathological […]

From Words to Worlds: Benchmarking Cross-Cultural Understanding in Machine Translation

arXiv:2603.17303v1 Announce Type: cross Abstract: Culture-expressions, such as idioms, slang, and culture-specific items (CSIs), are pervasive in natural language and encode meanings that go beyond literal linguistic form. Accurately translating such expressions remains challenging for machine translation systems. Despite this, existing benchmarks remain fragmented and do not provide a systematic framework for evaluating translation performance […]

Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress

arXiv:2603.17312v1 Announce Type: cross Abstract: Accurately estimating task progress is critical for embodied agents to plan and execute long-horizon, multi-step tasks. Despite promising advances, existing Vision-Language Models (VLMs) based methods primarily leverage their video understanding capabilities, while neglecting their complex reasoning potential. Furthermore, processing long video trajectories with VLMs is computationally prohibitive for real-world deployment. […]

Understanding and Defending VLM Jailbreaks via Jailbreak-Related Representation Shift

arXiv:2603.17372v1 Announce Type: cross Abstract: Large vision-language models (VLMs) often exhibit weakened safety alignment with the integration of the visual modality. Even when text prompts contain explicit harmful intent, adding an image can substantially increase jailbreak success rates. In this paper, we observe that VLMs can clearly distinguish benign inputs from harmful ones in their […]

Transformers are Bayesian Networks

arXiv:2603.17063v1 Announce Type: new Abstract: Transformers are the dominant architecture in AI, yet why they work remains poorly understood. This paper offers a precise answer: a transformer is a Bayesian network. We establish this in five ways. First, we prove that every sigmoid transformer with any weights implements weighted loopy belief propagation on its implicit […]

Real-World AI Evaluation: How FRAME Generates Systematic Evidence to Resolve the Decision-Maker’s Dilemma

arXiv:2603.13294v3 Announce Type: replace-cross Abstract: The rapid expansion of AI deployments has put organizational leaders in a decision-maker's dilemma: they must govern these technologies without systematic evidence of how systems behave in their own environments. Predominant evaluation methods generate scalable, abstract measures of model capabilities but smooth over the heterogeneity of real-world use, […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.