May 27, 2026 – Page 6 – dijee Pharma Intelligence

Can Broad Biomedical Knowledge be Contextualized into Scenario-Grounded Propositions?

arXiv:2605.27082v1 Announce Type: new Abstract: Biomedical discovery often requires connecting broad biomedical knowledge with specific experimental or clinical data. Background knowledge suggests relevant mechanisms but is usually too general to map directly onto dataset variables, while data-driven patterns can be dataset-specific and hard to interpret mechanistically. We study this missing link as knowledge contextualization: transforming […]

May 27, 2026

Scaling, Benchmarking, and Reasoning of Vision-Language Agents for Mobile GUI Navigation

arXiv:2605.27134v1 Announce Type: new Abstract: Vision-Language Models (VLMs) have shown rapid progress in mobile GUI navigation. This paper presents a systematic study of data scaling, benchmarking, and reasoning for VLM-based agents in this domain. To facilitate rigorous evaluation, we introduce HyperTrack, a large-scale dataset with over 16000 real-world tasks across more than 650 Chinese mobile […]

May 27, 2026

Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent

arXiv:2605.27078v1 Announce Type: cross Abstract: Training loss and accuracy are the standard signals used to monitor generalization during deep neural network training. Two well-documented phenomena complicate this picture: in grokking, train loss falls rapidly while test performance improves abruptly only after a long delay; in epoch-wise double descent, train loss decreases monotonically while test loss […]

May 27, 2026

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

arXiv:2605.27157v1 Announce Type: new Abstract: Retrieval-augmented LLMs are deployed for tasks where evidence quality determines action safety, yet evaluation protocols assume that single-turn robustness predicts robustness when evidence accumulates across turns. We show this assumption is fundamentally incorrect. Models exhibit a monitoring-control gap: they readily acknowledge contradictory evidence, yet this awareness fails to constrain their […]

May 27, 2026

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning

arXiv:2605.18592v2 Announce Type: replace-cross Abstract: Rubric-based reward shaping provides interpretable and editable reward signals for fine-tuning LLMs via reinforcement learning (RL), but existing adaptive rubric methods typically update criteria from local evidence such as the current batch or instance-level comparisons. This local view discards diagnostic information produced during training, making it difficult to track recurring […]

May 27, 2026

Gumbel Machine: Counterfactual Student Writing Generation via Gumbel Noise Steering

arXiv:2605.27249v1 Announce Type: new Abstract: An effective method of teaching across disciplines is to provide examples of high-quality work. However, an example may be significantly different from a student’s current work, making it challenging for them to emulate. An ideal learning demonstration is a counterfactual version of the student work, an improved version that is […]

May 27, 2026

E3: Issue-Level Backtesting for Automated Research Critique

arXiv:2605.27072v1 Announce Type: cross Abstract: We present E3, an automated review assistant that augments reviewers and engineering teams by identifying decision-relevant technical concerns in research papers. For each concern, E3 reports its nature, its location, its bearing on the contribution, and the analysis or evidence that would resolve it, covering unsupported claims, missing ablations, weak […]

May 27, 2026

2-ASP(Q) programs with weak constraints: Complexity and efficient implementation

arXiv:2605.27338v1 Announce Type: new Abstract: ASP(Q) extends Answer Set Programming (ASP) with Quantifiers over answer sets. In this paper we focus on the class of ASP(Q) programs with two quantifiers and weak constraints, denoted as 2-ASP(Q)^w. 2-ASP(Q)^w is a practically relevant fragment of ASP(Q) that is expressive enough to capture optimization problems up to the […]

May 27, 2026

Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders

arXiv:2605.16825v2 Announce Type: replace-cross Abstract: Recently, Generative Recommenders (GRs), characterized by a unified end-to-end framework, have exhibited astonishing potential in transforming the recommendation paradigm. Despite their effectiveness, we recognize that GRs are still susceptible to the long-standing issue of popularity bias that has pervaded the recommendation community. Although a few studies have attempted to extend […]

May 27, 2026

Xe-Forge: Multi-Stage LLM-Powered Kernel Optimization for Intel GPU

arXiv:2605.26118v1 Announce Type: cross Abstract: Porting deep learning algorithms to new hardware accelerators requires developers to repeatedly apply the same low-level optimizations — quantization, memory access coalescing, tile size tuning, and architecture-specific workarounds — to every Triton kernel in their code-base. This manual, repetitive effort is a major bottleneck: each kernel demands the same cycle […]

May 27, 2026

QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents

arXiv:2605.27068v1 Announce Type: cross Abstract: Social deduction games have become a popular testbed for probing reasoning, deception, coordination, and belief modeling in Large Language Model (LLM) agents. However, most environments are scored only by game outcomes such as win rates and remain largely limited to text-only interaction, making it difficult to tell whether an agent’s language […]

May 27, 2026

Eroding Trust in Real Speech: A Large-Scale Study of Human Audio Deepfake Perception

arXiv:2605.26136v1 Announce Type: cross Abstract: Audio deepfakes have recently improved rapidly, yet their effect on human trust in real speech remains unstudied. We present the largest listening study on audio deepfake perception to date, collecting 35,532 judgments from 1,768 participants across 138 text-to-speech and voice conversion systems. Our central finding is a skepticism shift: compared […]

May 27, 2026
