April 21, 2026 – Page 14 – dijee Pharma Intelligence

Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict

arXiv:2506.06485v4 Announce Type: replace-cross Abstract: Large language models (LLMs) draw on both contextual information and parametric memory, yet these sources can conflict. Prior studies have largely examined this issue in contextual question answering, implicitly assuming that tasks should rely on the provided context, leaving unclear how LLMs behave when tasks require different types and degrees […]

April 21, 2026

Reducing Peak Memory Usage for Modern Multimodal Large Language Model Pipelines

arXiv:2604.16734v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have recently demonstrated strong capabilities in understanding and generating responses from diverse visual inputs, including high-resolution images and long video sequences. As these models scale to richer visual representations, inference increasingly relies on storing large numbers of vision tokens in the key-value (KV) cache, making […]

April 21, 2026

The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

arXiv:2604.16689v1 Announce Type: new Abstract: Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model under randomized perturbations. This paper formulates this procedure as communication over a query channel, where the latent explanation acts as a message and each masked evaluation is a channel use. Within this […]

April 21, 2026

Hierarchical Vision Transformer Enhanced by Graph Convolutional Network for Image Classification

arXiv:2604.16823v1 Announce Type: cross Abstract: Vision Transformer (ViT) has brought new breakthroughs to the field of image classification by introducing the self-attention mechanism, and Graph Convolutional Networks (GCN) have been proposed and successfully applied in data representation and analysis. However, key challenges limit their further development: (1) The patch size selected by ViT […]

April 21, 2026

VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision

arXiv:2510.27462v2 Announce Type: replace-cross Abstract: Supervised fine-tuning (SFT) on long chain-of-thought (CoT) trajectories has emerged as a crucial technique for enhancing the reasoning abilities of large language models (LLMs). However, the standard cross-entropy loss treats all tokens equally, ignoring their heterogeneous contributions across a reasoning trajectory. This uniform treatment leads to misallocated supervision and weak […]

April 21, 2026

CoGR-MoE: Concept-Guided Expert Routing with Consistent Selection and Flexible Reasoning for Visual Question Answering

arXiv:2604.16930v1 Announce Type: cross Abstract: Visual Question Answering (VQA) requires models to identify the correct answer options based on both visual and textual evidence. Recent Mixture-of-Experts (MoE) methods improve option reasoning by grouping similar concepts or routing based on examples. However, unstable routing can lead to inconsistent expert selection in the same question type, while […]

April 21, 2026

RankGuide: Tensor-Rank-Guided Routing and Steering for Efficient Reasoning

arXiv:2604.16694v1 Announce Type: new Abstract: Large reasoning models (LRMs) enhance problem-solving capabilities by generating explicit multi-step chain-of-thought (CoT) reasoning; however, they incur substantial inference latency and computational overhead. To mitigate this issue, recent works have explored model collaboration paradigms, where small reasoning models (SRMs) generate intermediate reasoning steps to achieve a better accuracy–latency […]

April 21, 2026

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation

arXiv:2604.17020v1 Announce Type: cross Abstract: Static benchmarks for harmful content detection face limitations in scalability and diversity, and may also be affected by contamination from web-scale pre-training corpora. To address these issues, we propose a framework for synthesizing harmful content, leveraging persona-guided large language model (LLM) agents. Our approach constructs two-dimensional user personas by integrating […]

April 21, 2026

LVLMs and Humans Ground Differently in Referential Communication

arXiv:2601.19792v3 Announce Type: replace-cross Abstract: For generative AI agents to partner effectively with human users, the ability to accurately predict human intent is critical. But this ability to collaborate remains limited by a critical deficit: an inability to model common ground. We present a referential communication experiment with a factorial design involving director-matcher pairs (human-human, […]

April 21, 2026

The Consensus Trap: Rescuing Multi-Agent LLMs from Adversarial Majorities via Token-Level Collaboration

arXiv:2604.17139v1 Announce Type: cross Abstract: Multi-agent large language model (LLM) architectures increasingly rely on response-level aggregation, such as Majority Voting (MAJ), to raise reasoning ceilings. However, in open environments, agents are highly susceptible to stealthy contextual corruption, such as targeted prompt injections. We reveal a critical structural vulnerability in current multi-agent systems: response-level aggregation collapses […]

April 21, 2026

Evaluating Tool-Using Language Agents: Judge Reliability, Propagation Cascades, and Runtime Mitigation in AgentProp-Bench

arXiv:2604.16706v1 Announce Type: new Abstract: Automated evaluation of tool-using large language model (LLM) agents is widely assumed to be reliable, but this assumption has rarely been validated against human annotation. We introduce AgentProp-Bench, a 2,000-task benchmark with 2,300 traces across four domains, nine production LLMs, and a 100-label human-validated subset. We quantify judge reliability, characterize […]

April 21, 2026

Enhancing Zero-shot Personalized Image Aesthetics Assessment with Profile-aware Multimodal LLM

arXiv:2604.17233v1 Announce Type: cross Abstract: Personalized image aesthetics assessment (PIAA) aims to predict an individual user’s subjective rating of an image, which requires modeling user-specific aesthetic preferences. Existing methods rely on historical user ratings for this modeling and therefore struggle when such data are unavailable. We address this zero-shot setting by using user profiles as […]

April 21, 2026
