May 22, 2026 – Page 13 – dijee Pharma Intelligence

Planning in the LLM Era: Building for Reliability and Efficiency

arXiv:2605.21902v1 Announce Type: new Abstract: Growing attention to intelligent agents has put a spotlight on one of their central capabilities: planning. Early attempts to leverage large language models (LLMs) for planning relied on single-shot plan generation, followed by hybrid approaches that coupled LLMs with limited external search. These methods, unsound and incomplete by their very […]

May 22, 2026

ACE: Self-Evolving LLM Coding Framework via Adversarial Unit Test Generation and Preference Optimization

arXiv:2605.16299v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) excel at code generation but remain heavily reliant on large-scale annotated solutions and verification-based supervision, which constrains scalability and hinders sustained self-improvement. Recent solver–verifier frameworks exploit program execution as an automatic supervision signal, but their effectiveness degrades as solvers become moderately strong: verifier-generated tests increasingly confirm […]

May 22, 2026

A Characterization of Level-k Realizability for Clustering Systems

arXiv:2605.21945v1 Announce Type: new Abstract: We give a Hasse-diagram characterization of when a clustering system $\mathcal C$ on a finite taxa set $X$ is the hardwired clustering system $C_N$ of a rooted level-$k$ network. For each non-trivial block $B$ of $H=\mathcal H[\mathcal C]$, we define a parameter $\mu(B)$ using minimum families of clusters that generate […]

May 22, 2026

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

arXiv:2605.22321v1 Announce Type: cross Abstract: As autonomous agents (e.g., OpenClaw) increasingly operate with deep system-level privileges to execute complex tasks, they introduce severe, unmitigated security risks. Current vulnerability analyses overwhelmingly focus on single-turn, stateless behaviors, overlooking the expanded attack surface inherent in stateful, multi-turn interactions and dynamic tool invocations. In this paper, we propose a […]

May 22, 2026

Canonical Functionalism: Defining Functional Structure without Observer-Relative Semantic Maps

arXiv:2605.21506v1 Announce Type: new Abstract: Computational functionalism about consciousness is often criticized for relying on observer-relative interpretations of physical systems. This paper proposes a mathematical refinement of functionalism that avoids this problem. The central idea is that consciousness-relevant functional organization should be identified not with arbitrary input-output mappings, semantic labels, or externally imposed computational descriptions, […]

May 22, 2026

Bernini: Latent Semantic Planning for Video Diffusion

arXiv:2605.22344v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) and diffusion models have each reached remarkable maturity: MLLMs excel at reasoning over heterogeneous multimodal inputs with strong semantic grounding, while diffusion models synthesize images and videos with photorealistic fidelity. We argue that these two families can be unified through a simple division of labor: […]

May 22, 2026

AI-Enabled Serious Games: Integrating Intelligence and Adaptivity in Training Systems

arXiv:2605.21962v1 Announce Type: new Abstract: Serious games are widely used for learning and training across domains such as healthcare, defense, and education. Persistent challenges remain, however, including static scenario design, authoring bottlenecks, limited learner modeling, and difficulty implementing meaningful real-time instructional adaptation. Recent advances in artificial intelligence (AI) introduce novel capabilities such as dynamic scenario […]

May 22, 2026

Making the Discrete Continuous: Synthetic RAW Augmentations for Fine-Grained Evaluation of Person Detection Performance in Low Light

arXiv:2605.22455v1 Announce Type: cross Abstract: Real-world deployment of AI vision models is both fueled and limited by the data available for training and testing. Real datasets are sparse and uneven: long-tailed or unbalanced distributions hinder generalization, and the low number of samples in low density regions makes it hard to run evaluations. Synthetic data can […]

May 22, 2026

EmoTrack: Robust Depression Tracking from Counseling Transcripts across Session Regimes

arXiv:2605.22286v1 Announce Type: cross Abstract: Text-based counseling is an important interface for AI mental-health support, where transcripts may be used to monitor depression severity and flag sessions requiring timely human review. However, robust PHQ-8 prediction across session regimes remains challenging: fine-tuning-based methods can exploit richer supervision but may generalize poorly under data scarcity, while prompt-based […]

May 22, 2026

HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools

arXiv:2605.22733v1 Announce Type: new Abstract: Every Python function deployed as an LLM tool must today exist in two forms: an HTTP endpoint for human-facing clients and CI pipelines, and an MCP tool registration for agent runtimes such as Claude and Cursor. These representations share business logic yet diverge in all the surrounding machinery (routing, validation, […]

May 22, 2026

On the Wasserstein Gradient Flow Interpretation of Drifting Models

arXiv:2605.05118v2 Announce Type: replace-cross Abstract: Recently, Deng et al. (2026) proposed Generative Modeling via Drifting (GMD), a novel framework for generative tasks. This note presents an analysis of GMD through the lens of Wasserstein Gradient Flows (WGF), i.e., the path of steepest descent for a functional in the space of probability measures, equipped with the […]

May 22, 2026

MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering

arXiv:2605.22269v1 Announce Type: cross Abstract: Long streaming video QA remains challenging due to growing visual tokens and limited reasoning length of large language models (LLMs). KV-caching stores the Key-Value (KV) of the historical tokens via LLM prefill and enables more efficient streaming QA. However, existing methods cache every one or two frames, causing redundant memory […]

May 22, 2026
