Uncategorized – Page 261 – dijee Pharma Intelligence

Select Smarter, Not More: Prompt-Aware Evaluation Scheduling with Submodular Guarantees

arXiv:2604.11328v1 Announce Type: new Abstract: Automatic prompt optimization (APO) hinges on the quality of its evaluation signal, yet scoring every prompt candidate on the full training set is prohibitively expensive. Existing methods either fix a single evaluation subset before optimization begins (principled but prompt-agnostic) or adapt it heuristically during optimization (flexible but unstable and lacking […]

April 14, 2026

Inside the Scaffold: A Source-Code Taxonomy of Coding Agent Architectures

arXiv:2604.03515v2 Announce Type: replace-cross Abstract: LLM-based coding agents can localize bugs, generate patches, and run tests with diminishing human oversight, yet the scaffolding code that surrounds the language model (the control loop, tool definitions, state management, and context strategy) remains poorly understood. Existing surveys classify agents by abstract capabilities (tool use, planning, reflection) that cannot […]

April 14, 2026

Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

arXiv:2604.11462v1 Announce Type: new Abstract: Large Language Models (LLMs) struggle with long-horizon tasks due to the “context bottleneck” and the “lost-in-the-middle” phenomenon, where accumulated noise from verbose environments degrades reasoning over multi-turn interactions. To address this issue, we introduce a symbiotic framework that decouples context management from task execution. Our architecture pairs a lightweight, specialized […]

April 14, 2026

Tail-Aware Information-Theoretic Generalization for RLHF and SGLD

arXiv:2604.10727v1 Announce Type: cross Abstract: Classical information-theoretic generalization bounds typically control the generalization gap through KL-based mutual information and therefore rely on boundedness or sub-Gaussian tails via the moment generating function (MGF). In many modern pipelines, such as robust learning, RLHF, and stochastic optimization, losses and rewards can be heavy-tailed, and MGFs may not exist, […]

April 14, 2026

PAC-BENCH: Evaluating Multi-Agent Collaboration under Privacy Constraints

arXiv:2604.11523v1 Announce Type: new Abstract: We are entering an era in which individuals and organizations increasingly deploy dedicated AI agents that interact and collaborate with other agents. However, the dynamics of multi-agent collaboration under privacy constraints remain poorly understood. In this work, we present $PACtext-Bench$, a benchmark for systematic evaluation of multi-agent collaboration under privacy […]

April 14, 2026

Zero-Shot Quantization via Weight-Space Arithmetic

arXiv:2604.03420v2 Announce Type: replace-cross Abstract: We show that robustness to post-training quantization (PTQ) is a transferable direction in weight space. We call this direction the quantization vector: extracted from a donor task by simple weight-space arithmetic, it can be used to patch a receiver model and improve post-PTQ Top-1 accuracy by up to 60 points […]

April 14, 2026

Turning Generators into Retrievers: Unlocking MLLMs for Natural Language-Guided Geo-Localization

arXiv:2604.10721v1 Announce Type: cross Abstract: Natural-language Guided Cross-view Geo-localization (NGCG) aims to retrieve geo-tagged satellite imagery using textual descriptions of ground scenes. While recent NGCG methods commonly rely on CLIP-style dual-encoder architectures, they often suffer from weak cross-modal generalization and require complex architectural designs. In contrast, Multimodal Large Language Models (MLLMs) offer powerful semantic reasoning […]

April 14, 2026

Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation

arXiv:2604.10511v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for causal and counterfactual reasoning, yet their reliability in real-world policy evaluation remains underexplored. We construct a benchmark of 40 empirical policy evaluation cases drawn from economics and social science, each grounded in peer-reviewed evidence and classified by intuitiveness — whether the empirical […]

April 14, 2026

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

arXiv:2604.03147v2 Announce Type: replace-cross Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive emotion steering vectors, then learn VA axes as linear combinations of their top PCA components via ridge regression on the model’s self-reported valence-arousal scores. The resulting VA subspace exhibits […]

April 14, 2026

Failure Ontology: A Lifelong Learning Framework for Blind Spot Detection and Resilience Design

arXiv:2604.10549v1 Announce Type: new Abstract: Personalized learning systems are almost universally designed around a single objective: help people acquire knowledge and skills more efficiently. We argue this framing misses the more consequential problem. The most damaging failures in human life-financial ruin, health collapse, professional obsolescence-are rarely caused by insufficient knowledge acquisition. They arise from the […]

April 14, 2026

Detecting RAG Extraction Attack via Dual-Path Runtime Integrity Game

arXiv:2604.10717v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) systems augment large language models with external knowledge, yet introduce a critical security vulnerability: RAG Knowledge Base Leakage, wherein adversarial prompts can induce the model to divulge retrieved proprietary content. Recent studies reveal that such leakage can be executed through adaptive and iterative attack strategies (named RAG […]

April 14, 2026

Enhancing Cross-Problem Vehicle Routing via Federated Learning

arXiv:2604.10652v1 Announce Type: new Abstract: Vehicle routing problems (VRPs) constitute a core optimization challenge in modern logistics and supply chain management. The recent neural combinatorial optimization (NCO) has demonstrated superior efficiency over some traditional algorithms. While serving as a primary NCO approach for solving general VRPs, current cross-problem learning paradigms are still subject to performance […]

April 14, 2026

Subscribe for Updates