Large Language Models Meet Biomedical Knowledge Graphs for Mechanistically Grounded Therapeutic Prioritization

arXiv:2604.19815v1 Announce Type: new Abstract: Drug repurposing is often framed as a candidate identification task, but existing approaches provide limited guidance for distinguishing biologically plausible candidates from historically well-connected ones. Here we introduce DrugKLM, a hybrid framework that integrates biomedical knowledge graph structure with large language model-based mechanistic reasoning to enable mechanistically grounded therapeutic prioritization. […]

LLM Agents Predict Social Media Reactions but Do Not Outperform Text Classifiers: Benchmarking Simulation Accuracy Using 120K+ Personas of 1511 Humans

arXiv:2604.19787v1 Announce Type: cross Abstract: Social media platforms mediate how billions form opinions and engage with public discourse. As autonomous AI agents increasingly participate in these spaces, understanding their behavioral fidelity becomes critical for platform governance and democratic resilience. Previous work demonstrates that LLM-powered agents can replicate aggregate survey responses, yet few studies test whether […]

Exploiting LLM-as-a-Judge Disposition on Free Text Legal QA via Prompt Optimization

arXiv:2604.20726v1 Announce Type: cross Abstract: This work explores the role of prompt design and judge selection in LLM-as-a-Judge evaluations of free text legal question answering. We examine whether automatic task prompt optimization improves over human-centered design, whether optimization effectiveness varies by judge feedback style, and whether optimized prompts transfer across judges. We systematically address these […]

Emergence Transformer: Dynamical Temporal Attention Matters

arXiv:2604.19816v1 Announce Type: new Abstract: The Transformer, a breakthrough architecture in artificial intelligence, owes its success to the attention mechanism, which utilizes long-range interactions in sequential data, enabling the emergent coherence between large language models (LLMs) and data distributions. However, temporal attention, that is, different forms of long-range interactions in temporal sequences, has rarely been […]

Utterance-Level Methods for Identifying Reliable ASR-Output for Child Speech

arXiv:2604.19801v1 Announce Type: cross Abstract: Automatic Speech Recognition (ASR) is increasingly used in applications involving child speech, such as language learning and literacy acquisition. However, the effectiveness of such applications is limited by high ASR error rates. The negative effects can be mitigated by identifying in advance which ASR-outputs are reliable. This work aims to […]

OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models

arXiv:2604.20806v1 Announce Type: cross Abstract: Large vision-language models (LVLMs) have made substantial advances in reasoning tasks at the Olympiad level. Nevertheless, current Olympiad-level multimodal reasoning benchmarks for these models often emphasize single-image analysis and fail to exploit contextual information across multiple images. We present OMIBench, a benchmark designed to evaluate Olympiad-level reasoning when the required […]

Model Capability Assessment and Safeguards for Biological Weaponization

arXiv:2604.19811v1 Announce Type: cross Abstract: AI leaders and safety reports increasingly warn that advances in model reasoning may enable biological misuse, including by low-expertise users, while major labs describe safeguards as expanding but still evolving rather than settled. This study benchmarks ChatGPT 5.2 Auto, Gemini 3 Pro Thinking, Claude Opus 4.5 and Meta’s Muse Spark […]

JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents

arXiv:2604.19821v1 Announce Type: new Abstract: Large language model (LLM) agents augmented with external tools often struggle as tool inventories grow large and domain-specific. In such settings, ambiguous tool descriptions and under-specified agent instructions frequently lead to tool mis-selection and incorrect slot/value instantiation. We hypothesize that this is due to two root causes: generic, […]

SolidCoder: Bridging the Mental-Reality Gap in LLM Code Generation through Concrete Execution

arXiv:2604.19825v1 Announce Type: cross Abstract: State-of-the-art code generation frameworks rely on mental simulation, where LLMs internally trace execution to verify correctness. We expose a fundamental limitation: the Mental-Reality Gap — where models hallucinate execution traces and confidently validate buggy code. This gap manifests along two orthogonal dimensions: the Specification Gap (overlooking edge cases during planning) […]

A Survey of Scaling in Large Language Model Reasoning

arXiv:2504.02181v2 Announce Type: replace Abstract: The rapid advancements in large language models (LLMs) have significantly enhanced their reasoning capabilities, driven by various strategies such as multi-agent collaboration. However, unlike the well-established performance improvements achieved through scaling data and model size, the scaling of reasoning in LLMs is more complex and can even negatively impact reasoning […]

More Is Different: Toward a Theory of Emergence in AI-Native Software Ecosystems

arXiv:2604.19827v1 Announce Type: cross Abstract: Software engineering faces a fundamental challenge: multi-agent AI systems fail in ways that defy explanation by traditional theories. While individual agents perform correctly, their interactions degrade entire ecosystems, revealing a gap in our understanding of software evolution. This paper argues that AI-native software ecosystems must be studied as complex adaptive […]

Forage V2: Knowledge Evolution and Transfer in Autonomous Agent Organizations

arXiv:2604.19837v1 Announce Type: new Abstract: Autonomous agents operating in open-world tasks — where the completion boundary is not given in advance — face denominator blindness: they systematically underestimate the scope of the target space. Forage V1 addressed this through co-evolving evaluation (an independent Evaluator discovers what “complete” means) and method isolation (Evaluator and Planner cannot […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.