March 24, 2026 – Page 11 – dijee Pharma Intelligence

ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents

arXiv:2603.20340v1 Announce Type: cross Abstract: Despite rapid progress in multimodal GUI agents, reusable skill acquisition remains difficult because on-demand generated skills often leave action semantics, state assumptions, and success criteria implicit. This makes them brittle to execution errors, hard to verify, and difficult to repair. We present ContractSkill, a framework that converts a draft skill […]

March 24, 2026

PA3: Policy-Aware Agent Alignment through Chain-of-Thought

arXiv:2603.14602v2 Announce Type: replace-cross Abstract: Conversational assistants powered by large language models (LLMs) excel at tool-use tasks but struggle with adhering to complex, business-specific rules. While models can reason over business rules provided in context, including all policies for every query introduces high latency and wastes compute. Furthermore, these lengthy prompts lead to long contexts, […]

March 24, 2026

Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors

arXiv:2603.21921v1 Announce Type: cross Abstract: The temporal difference (TD) error was first formalized in Sutton (1988), where it was initially characterized as the difference between temporally successive predictions, and later, in that same work, formulated as the difference between a bootstrapped target and a prediction. Since then, these two interpretations of the TD error have […]

March 24, 2026

LLM-Enhanced Energy Contrastive Learning for Out-of-Distribution Detection in Text-Attributed Graphs

arXiv:2603.20293v1 Announce Type: new Abstract: Text-attributed graphs, where nodes are enriched with textual attributes, have become a powerful tool for modeling real-world networks such as citation, social, and transaction networks. However, existing methods for learning from these graphs often assume that the distributions of training and testing data are consistent. This assumption leads to significant […]

March 24, 2026

AgenticRec: End-to-End Tool-Integrated Policy Optimization for Ranking-Oriented Recommender Agents

arXiv:2603.21613v1 Announce Type: cross Abstract: Recommender agents built on Large Language Models offer a promising paradigm for recommendation. However, existing recommender agents typically suffer from a disconnect between intermediate reasoning and final ranking feedback, and are unable to capture fine-grained preferences. To address this, we present AgenticRec, a ranking-oriented agentic recommendation framework that optimizes the […]

March 24, 2026

Are Your Reasoning Models Reasoning or Guessing? A Mechanistic Analysis of Hierarchical Reasoning Models

arXiv:2601.10679v2 Announce Type: replace Abstract: The hierarchical reasoning model (HRM) achieves extraordinary performance on various reasoning tasks, significantly outperforming large language model-based reasoners. To understand the strengths and potential failure modes of HRM, we conduct a mechanistic study of its reasoning patterns and find three surprising facts: (a) Failure on extremely simple puzzles, e.g., HRM can […]

March 24, 2026

End-to-End Training for Unified Tokenization and Latent Denoising

arXiv:2603.22283v1 Announce Type: cross Abstract: Latent diffusion models (LDMs) enable high-fidelity synthesis by operating in learned latent spaces. However, training state-of-the-art LDMs requires complex staging: a tokenizer must be trained first, before the diffusion model can be trained in the frozen latent space. We propose UNITE – an autoencoder architecture for unified tokenization and latent […]

March 24, 2026

ViCLSR: A Supervised Contrastive Learning Framework with Natural Language Inference for Natural Language Understanding Tasks

arXiv:2603.21084v1 Announce Type: cross Abstract: High-quality text representations are crucial for natural language understanding (NLU), but low-resource languages like Vietnamese face challenges due to limited annotated data. While pre-trained models like PhoBERT and CafeBERT perform well, their effectiveness is constrained by data scarcity. Contrastive learning (CL) has recently emerged as a promising approach for improving […]

March 24, 2026

When Convenience Becomes Risk: A Semantic View of Under-Specification in Host-Acting Agents

arXiv:2603.21231v1 Announce Type: cross Abstract: Host-acting agents promise a convenient interaction model in which users specify goals and the system determines how to realize them. We argue that this convenience introduces a distinct security problem: semantic under-specification in goal specification. User instructions are typically goal-oriented, yet they often leave process constraints, safety boundaries, persistence, and […]

March 24, 2026

Bypassing Document Ingestion: An MCP Approach to Financial Q&A

arXiv:2603.20316v1 Announce Type: cross Abstract: Answering financial questions is often treated as an information retrieval problem. In practice, however, much of the relevant information is already available in curated vendor systems, especially for quantitative analysis. We study whether, and under which conditions, Model Context Protocol (MCP) offers a more reliable alternative to standard retrieval-augmented generation […]

March 24, 2026

CAMA: Exploring Collusive Adversarial Attacks in c-MARL

arXiv:2603.20390v1 Announce Type: cross Abstract: Cooperative multi-agent reinforcement learning (c-MARL) has been widely deployed in real-world applications such as social robots, embodied intelligence, and UAV swarms. Nevertheless, many adversarial attacks still threaten c-MARL systems. At present, studies mainly focus on single-adversary perturbation attacks and white-box adversarial attacks that manipulate agents’ internal […]

March 24, 2026

ReBOL: Retrieval via Bayesian Optimization with Batched LLM Relevance Observations and Query Reformulation

arXiv:2603.20513v1 Announce Type: cross Abstract: LLM-reranking is limited by the top-k documents retrieved by vector similarity, which neither enables contextual query-document token interactions nor captures multimodal relevance distributions. While LLM query reformulation attempts to improve recall by generating improved or additional queries, it is still followed by vector similarity retrieval. We thus propose to address […]

March 24, 2026
