Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction

arXiv:2505.11063v3 Announce Type: replace Abstract: LLM-based agents solve complex tasks through iterative reasoning, tool use, and environment interaction, where each intermediate thought directly shapes subsequent actions. Small deviations in these thoughts can therefore propagate into unsafe behaviors, yet existing guardrails typically operate only on final outputs or require intrusive model modifications. We introduce Thought-Aligner, a […]

Decoupled Delay Compensation: Enhancing Pre-trained MARL Policies via Learned Dynamics Filtering

arXiv:2605.26286v1 Announce Type: cross Abstract: Real-world multi-agent reinforcement learning (MARL) systems must often operate under stale observations, stochastic communication delays, and intermittent packet loss. Policies trained under idealized synchronous conditions frequently exhibit significant performance degradation in these regimes because they act on outdated feedback. We propose a modular execution-stage state-estimation layer that replaces delayed communicated […]

OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

arXiv:2605.26322v1 Announce Type: new Abstract: Theory of Mind (ToM), the ability to infer others’ knowledge, intentions, and emotions, is commonly evaluated in large language models (LLMs) using end-point question answering, where performance is judged solely by the final answer to a social reasoning query. This paradigm obscures whether the model actually constructs the underlying mental-state […]

The Two Boundaries: Why Behavioral AI Governance Fails Structurally

arXiv:2604.27292v3 Announce Type: replace Abstract: Every system that performs effects has two boundaries: what it can do (expressiveness) and what governance covers (governance). In nearly all deployed AI systems, these boundaries are defined independently, creating three regions: governed capabilities (the only useful region), ungoverned capabilities (risk), and governance policies that address non-existent capabilities (theater). Two […]

JobBench: Aligning Agent Work With Human Will

arXiv:2605.26329v1 Announce Type: new Abstract: Current benchmarks for occupational AI agents are scoped primarily by economic values, telling a replacement story. We introduce JobBench, which evaluates AI agents on the workflows that experts identify as high-priority for delegation, empowering humans based on their needs instead of replacing them with GDP value. JobBench covers 130 agentic […]

Yes, Q-learning Helps Offline In-Context RL

arXiv:2502.17666v4 Announce Type: replace-cross Abstract: Existing offline in-context reinforcement learning (ICRL) methods have predominantly relied on supervised training objectives, which are known to have limitations in offline RL settings. In this study, we explore the integration of RL objectives within an offline ICRL framework. Through experiments on more than 150 GridWorld and MuJoCo environment-derived datasets, […]

Managing Uncertainty in LLM-Generated Procedural Knowledge for Virtual Laboratory Planning

arXiv:2605.26333v1 Announce Type: new Abstract: Educational virtual laboratories can make experimental training more scalable, adaptive, and accessible, especially when students have limited access to physical laboratory facilities. However, authoring new simulated laboratory procedures remains costly: educators must describe new equipment, define how instruments and materials interact, and specify valid procedural flows that can be executed […]

SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking

arXiv:2511.04711v2 Announce Type: replace-cross Abstract: Large-scale vision-language models, especially CLIP, have demonstrated remarkable performance across diverse downstream tasks. Soft prompts, as carefully crafted modules that efficiently adapt vision-language models to specific tasks, necessitate effective copyright protection. In this paper, we investigate model copyright protection by auditing whether suspicious third-party models incorporate protected soft prompts. While […]

FLUIDSPLAT: Reconstructing Physical Fields from Sparse Sensors via Gaussian Primitives

arXiv:2605.18866v2 Announce Type: replace-cross Abstract: Reconstructing continuous flow fields from sparse surface-mounted sensors is central to aerodynamic design, flow control, and digital-twin instrumentation. Existing neural methods for this task typically encode sensor readings into implicit latent codes with little spatial interpretability and limited formal guidance on how representational capacity should scale with observation count. Inspired […]

Intelligent Detection and Mitigation of Carpet-Bombing DDoS Attacks in SDN Using Retrieval-Augmented Generation and Large Language Models

arXiv:2605.26307v1 Announce Type: cross Abstract: Software-Defined Networking (SDN) provides flexible and programmable network management; however, its centralized control architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks, particularly Carpet-Bombing DDoS attacks that distribute malicious traffic across multiple targets to evade conventional detection mechanisms. In this paper, a Retrieval-Augmented Generation (RAG)-based framework is proposed for real-time […]

Diff-Instruct with Diffused Reward: Towards Principled One-step Generator RL

arXiv:2605.24001v2 Announce Type: replace-cross Abstract: Recent advances in one-step text-to-image generation have enabled real-time synthesis with remarkable efficiency and quality. Previous reinforcement learning methods for one-step generators combine image-space reward optimization with diffusion noisy-space distribution matching. This paradigm brings challenges due to a mismatch between terminal reward optimization and the underlying generative dynamics. As a […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.