May 27, 2026 – Page 21 – dijee Pharma Intelligence

Quantized Keys Steal Attention: Bias Correction for KV-Cache Compression in Video Diffusion

arXiv:2605.26266v1 Announce Type: cross Abstract: Chunk-wise autoregressive video diffusion models rely on a KV cache of previously generated chunks to avoid redundant computation, but this cache quickly becomes a memory bottleneck as videos grow longer. Methods that quantize the KV cache to low bitwidths reduce memory pressure but degrade video quality. We show that a […]

May 27, 2026

Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction

arXiv:2505.11063v3 Announce Type: replace Abstract: LLM-based agents solve complex tasks through iterative reasoning, tool use, and environment interaction, where each intermediate thought directly shapes subsequent actions. Small deviations in these thoughts can therefore propagate into unsafe behaviors, yet existing guardrails typically operate only on final outputs or require intrusive model modifications. We introduce Thought-Aligner, a […]

May 27, 2026

Decoupled Delay Compensation: Enhancing Pre-trained MARL Policies via Learned Dynamics Filtering

arXiv:2605.26286v1 Announce Type: cross Abstract: Real-world multi-agent reinforcement learning (MARL) systems must often operate under stale observations, stochastic communication delays, and intermittent packet loss. Policies trained under idealized synchronous conditions frequently exhibit significant performance degradation in these regimes because they act on outdated feedback. We propose a modular execution-stage state-estimation layer that replaces delayed communicated […]

May 27, 2026

OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

arXiv:2605.26322v1 Announce Type: new Abstract: Theory of Mind (ToM), the ability to infer others’ knowledge, intentions, and emotions, is commonly evaluated in large language models (LLMs) using end-point question answering, where performance is judged solely by the final answer to a social reasoning query. This paradigm obscures whether the model actually constructs the underlying mental-state […]

May 27, 2026

The Two Boundaries: Why Behavioral AI Governance Fails Structurally

arXiv:2604.27292v3 Announce Type: replace Abstract: Every system that performs effects has two boundaries: what it can do (expressiveness) and what governance covers (governance). In nearly all deployed AI systems, these boundaries are defined independently, creating three regions: governed capabilities (the only useful region), ungoverned capabilities (risk), and governance policies that address non-existent capabilities (theater). Two […]

May 27, 2026

JobBench: Aligning Agent Work With Human Will

arXiv:2605.26329v1 Announce Type: new Abstract: Current benchmarks for occupational AI agents are scoped primarily by economic values, telling a replacement story. We introduce JobBench, which evaluates AI agents on the workflows that experts identify as high-priority for delegation, empowering humans based on their needs instead of replacing them with GDP value. JobBench covers 130 agentic […]

May 27, 2026

Yes, Q-learning Helps Offline In-Context RL

arXiv:2502.17666v4 Announce Type: replace-cross Abstract: Existing offline in-context reinforcement learning (ICRL) methods have predominantly relied on supervised training objectives, which are known to have limitations in offline RL settings. In this study, we explore the integration of RL objectives within an offline ICRL framework. Through experiments on more than 150 GridWorld and MuJoCo environment-derived datasets, […]

May 27, 2026

Managing Uncertainty in LLM-Generated Procedural Knowledge for Virtual Laboratory Planning

arXiv:2605.26333v1 Announce Type: new Abstract: Educational virtual laboratories can make experimental training more scalable, adaptive, and accessible, especially when students have limited access to physical laboratory facilities. However, authoring new simulated laboratory procedures remains costly: educators must describe new equipment, define how instruments and materials interact, and specify valid procedural flows that can be executed […]

May 27, 2026

SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking

arXiv:2511.04711v2 Announce Type: replace-cross Abstract: Large-scale vision-language models, especially CLIP, have demonstrated remarkable performance across diverse downstream tasks. Soft prompts, as carefully crafted modules that efficiently adapt vision-language models to specific tasks, necessitate effective copyright protection. In this paper, we investigate model copyright protection by auditing whether suspicious third-party models incorporate protected soft prompts. While […]

May 27, 2026

FLUIDSPLAT: Reconstructing Physical Fields from Sparse Sensors via Gaussian Primitives

arXiv:2605.18866v2 Announce Type: replace-cross Abstract: Reconstructing continuous flow fields from sparse surface-mounted sensors is central to aerodynamic design, flow control, and digital-twin instrumentation. Existing neural methods for this task typically encode sensor readings into implicit latent codes with little spatial interpretability and limited formal guidance on how representational capacity should scale with observation count. Inspired […]

May 27, 2026

Intelligent Detection and Mitigation of Carpet-Bombing DDoS Attacks in SDN Using Retrieval-Augmented Generation and Large Language Models

arXiv:2605.26307v1 Announce Type: cross Abstract: Software-Defined Networking (SDN) provides flexible and programmable network management; however, its centralized control architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks, particularly Carpet-Bombing DDoS attacks that distribute malicious traffic across multiple targets to evade conventional detection mechanisms. In this paper, a Retrieval-Augmented Generation (RAG)-based framework is proposed for real-time […]

May 27, 2026

Diff-Instruct with Diffused Reward: Towards Principled One-step Generator RL

arXiv:2605.24001v2 Announce Type: replace-cross Abstract: Recent advances in one-step text-to-image generation have enabled real-time synthesis with remarkable efficiency and quality. Previous reinforcement learning methods for one-step generators combine image-space reward optimization with diffusion noisy-space distribution matching. This paradigm brings challenges due to a mismatch between terminal reward optimization and the underlying generative dynamics. As a […]

May 27, 2026
