Uncategorized – Page 262 – dijee Pharma Intelligence

Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

arXiv:2604.11462v1 Announce Type: new Abstract: Large Language Models (LLMs) struggle with long-horizon tasks due to the “context bottleneck” and the “lost-in-the-middle” phenomenon, where accumulated noise from verbose environments degrades reasoning over multi-turn interactions. To address this issue, we introduce a symbiotic framework that decouples context management from task execution. Our architecture pairs a lightweight, specialized […]

April 14, 2026

Tail-Aware Information-Theoretic Generalization for RLHF and SGLD

arXiv:2604.10727v1 Announce Type: cross Abstract: Classical information-theoretic generalization bounds typically control the generalization gap through KL-based mutual information and therefore rely on boundedness or sub-Gaussian tails via the moment generating function (MGF). In many modern pipelines, such as robust learning, RLHF, and stochastic optimization, losses and rewards can be heavy-tailed, and MGFs may not exist, […]

April 14, 2026

PAC-BENCH: Evaluating Multi-Agent Collaboration under Privacy Constraints

arXiv:2604.11523v1 Announce Type: new Abstract: We are entering an era in which individuals and organizations increasingly deploy dedicated AI agents that interact and collaborate with other agents. However, the dynamics of multi-agent collaboration under privacy constraints remain poorly understood. In this work, we present PAC-Bench, a benchmark for systematic evaluation of multi-agent collaboration under privacy […]

April 14, 2026

Zero-Shot Quantization via Weight-Space Arithmetic

arXiv:2604.03420v2 Announce Type: replace-cross Abstract: We show that robustness to post-training quantization (PTQ) is a transferable direction in weight space. We call this direction the quantization vector: extracted from a donor task by simple weight-space arithmetic, it can be used to patch a receiver model and improve post-PTQ Top-1 accuracy by up to 60 points […]

April 14, 2026

Turning Generators into Retrievers: Unlocking MLLMs for Natural Language-Guided Geo-Localization

arXiv:2604.10721v1 Announce Type: cross Abstract: Natural-language Guided Cross-view Geo-localization (NGCG) aims to retrieve geo-tagged satellite imagery using textual descriptions of ground scenes. While recent NGCG methods commonly rely on CLIP-style dual-encoder architectures, they often suffer from weak cross-modal generalization and require complex architectural designs. In contrast, Multimodal Large Language Models (MLLMs) offer powerful semantic reasoning […]

April 14, 2026

Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation

arXiv:2604.10511v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for causal and counterfactual reasoning, yet their reliability in real-world policy evaluation remains underexplored. We construct a benchmark of 40 empirical policy evaluation cases drawn from economics and social science, each grounded in peer-reviewed evidence and classified by intuitiveness — whether the empirical […]

April 14, 2026

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

arXiv:2604.03147v2 Announce Type: replace-cross Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive emotion steering vectors, then learn VA axes as linear combinations of their top PCA components via ridge regression on the model’s self-reported valence-arousal scores. The resulting VA subspace exhibits […]

April 14, 2026

Failure Ontology: A Lifelong Learning Framework for Blind Spot Detection and Resilience Design

arXiv:2604.10549v1 Announce Type: new Abstract: Personalized learning systems are almost universally designed around a single objective: help people acquire knowledge and skills more efficiently. We argue this framing misses the more consequential problem. The most damaging failures in human life (financial ruin, health collapse, professional obsolescence) are rarely caused by insufficient knowledge acquisition. They arise from the […]

April 14, 2026

Detecting RAG Extraction Attack via Dual-Path Runtime Integrity Game

arXiv:2604.10717v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) systems augment large language models with external knowledge, yet introduce a critical security vulnerability: RAG Knowledge Base Leakage, wherein adversarial prompts can induce the model to divulge retrieved proprietary content. Recent studies reveal that such leakage can be executed through adaptive and iterative attack strategies (named RAG […]

April 14, 2026

Enhancing Cross-Problem Vehicle Routing via Federated Learning

arXiv:2604.10652v1 Announce Type: new Abstract: Vehicle routing problems (VRPs) constitute a core optimization challenge in modern logistics and supply chain management. Recent neural combinatorial optimization (NCO) methods have demonstrated superior efficiency over some traditional algorithms. While serving as a primary NCO approach for solving general VRPs, current cross-problem learning paradigms are still subject to performance […]

April 14, 2026

FedRio: Personalized Federated Social Bot Detection via Cooperative Reinforced Contrastive Adversarial Distillation

arXiv:2604.10678v1 Announce Type: new Abstract: Social bot detection is critical to the stability and security of online social platforms. However, current state-of-the-art bot detection models are largely developed in isolation, overlooking the benefits of leveraging shared detection patterns across platforms to improve performance and promptly identify emerging bot variants. The heterogeneity of data distributions and […]

April 14, 2026

Time is Not a Label: Continuous Phase Rotation for Temporal Knowledge Graphs and Agentic Memory

arXiv:2604.11544v1 Announce Type: cross Abstract: Structured memory representations such as knowledge graphs are central to autonomous agents and other long-lived systems. However, most existing approaches model time as discrete metadata, either sorting by recency (burying old-yet-permanent knowledge), simply overwriting outdated facts, or requiring an expensive LLM call at every ingestion step, leaving them unable to […]

April 14, 2026
