AgentV-RL: Scaling Reward Modeling with Agentic Verifier

arXiv:2604.16004v1 Announce Type: cross Abstract: Verifiers have been demonstrated to enhance LLM reasoning via test-time scaling (TTS). Yet they face significant challenges in complex domains. Error propagation from incorrect intermediate reasoning can lead to false positives for seemingly plausible solutions, while a lack of external grounding makes verifiers unreliable on computation- or knowledge-intensive tasks. To address these […]
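The abstract is truncated, but verifier-based test-time scaling of the kind it references is commonly realized as best-of-N reranking: sample several candidate solutions and keep the one the verifier scores highest. A minimal sketch, assuming generic `generate` and `verify` callables as placeholders (these names are illustrative, not part of the paper):

```python
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              verify: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate solutions and return the one the
    verifier scores highest (generic best-of-N reranking)."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: verify(prompt, c))
```

The failure modes the abstract names map directly onto `verify`: a verifier fooled by a plausible-looking but wrong candidate returns a high score, and best-of-N then selects the false positive.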

Robust Synchronisation for Federated Learning in the Face of Correlated Device Failure

arXiv:2604.16090v1 Announce Type: cross Abstract: Probabilistic Synchronous Parallel (PSP) is a technique in distributed learning systems to reduce synchronization bottlenecks by sampling a subset of participating nodes per round. In Federated Learning (FL), where edge devices are often unreliable due to factors including mobility, power constraints, and user activity, PSP helps improve system throughput. However, […]
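Probabilistic Synchronous Parallel, as the abstract describes it, synchronizes only a sampled subset of nodes each round rather than waiting on every device. A minimal illustrative round, assuming a generic sampling fraction and FedAvg-style coordinate-wise averaging (both are generic assumptions, not details from this paper):

```python
import random

def psp_round(client_updates, sample_frac=0.3, rng=random):
    """One PSP-style round: wait on a random subset of clients and
    average only their model updates, instead of synchronizing all."""
    k = max(1, int(len(client_updates) * sample_frac))
    sampled = rng.sample(list(client_updates), k)
    dim = len(sampled[0])
    # Coordinate-wise mean of the sampled updates (FedAvg-style aggregation).
    return [sum(u[i] for u in sampled) / k for i in range(dim)]
```

The paper's concern is visible in the sketch: if device failures are correlated, the sampled subset is not an unbiased draw from the population, which is what uniform `rng.sample` implicitly assumes.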

Towards Rigorous Explainability by Feature Attribution

arXiv:2604.15898v1 Announce Type: new Abstract: For around a decade, non-symbolic methods have been the option of choice when explaining complex machine learning (ML) models. Unfortunately, such methods lack rigor and can mislead human decision-makers. In high-stakes uses of ML, the lack of rigor is especially problematic. One prime example of provable lack of rigor is […]

Prototype-Grounded Concept Models for Verifiable Concept Alignment

arXiv:2604.16076v1 Announce Type: cross Abstract: Concept Bottleneck Models (CBMs) aim to improve interpretability in Deep Learning by structuring predictions through human-understandable concepts, but they provide no way to verify whether learned concepts align with the human’s intended meaning, hurting interpretability. We introduce Prototype-Grounded Concept Models (PGCMs), which ground concepts in learned visual prototypes: image parts […]

AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency

arXiv:2604.16158v1 Announce Type: cross Abstract: Large language models (LLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex tasks. Yet ensuring that the reasoning trace both contributes to and faithfully reflects the processes underlying the model’s final answer, rather than merely accompanying it, remains challenging. We introduce AtManRL, a method that leverages differentiable attention manipulation […]

Machine learning approaches to uncover the neural mechanisms of motivated behaviour: from ADHD to individual differences in effort and reward sensitivity

arXiv:2604.15363v1 Announce Type: new Abstract: Motivated behaviour relies on the brain’s capacity to evaluate effort and reward. Dysregulation within these processes contributes to a spectrum of conditions, from hyperactivity in attention-deficit/hyperactivity disorder (ADHD) to diminished goal-directed behaviour in apathy. This thesis investigates the neural mechanisms underlying ADHD using electroencephalography (EEG) and examines individual differences in […]

Integrating Graphs, Large Language Models, and Agents: Reasoning and Retrieval

arXiv:2604.15951v1 Announce Type: new Abstract: Generative AI, particularly Large Language Models, increasingly integrates graph-based representations to enhance reasoning, retrieval, and structured decision-making. Despite rapid advances, there remains limited clarity regarding when, why, where, and what types of graph-LLM integrations are most appropriate across applications. This survey provides a concise, structured overview of the design choices […]

Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation

arXiv:2604.15482v1 Announce Type: cross Abstract: Large Language Model (LLM) unlearning is crucial for removing hazardous or privacy-leaking information from the model. Practical LLM unlearning demands satisfying multiple challenging objectives simultaneously: removing undesirable knowledge, preserving general utility, avoiding over-refusal of neighboring concepts, and, crucially, ensuring robustness against adversarial probing attacks. However, existing unlearning methods primarily focus […]

A Two-Stage, Object-Centric Deep Learning Framework for Robust Exam Cheating Detection

arXiv:2604.16234v1 Announce Type: cross Abstract: Academic integrity continues to face the persistent challenge of examination cheating. Traditional invigilation relies on human observation, which is inefficient, costly, and prone to errors at scale. Although some existing AI-powered monitoring systems have been deployed and trusted, many lack transparency or require multi-layered architectures to achieve the desired performance. […]

A Q-learning-based QoS-aware multipath routing protocol in IoMT-based wireless body area network

arXiv:2604.15489v1 Announce Type: cross Abstract: The Internet of Medical Things (IoMT) enables intelligent healthcare services but faces challenges such as dynamic topology, energy constraints, and diverse QoS requirements. This paper proposes QQMR, a Q-learning-based QoS-aware multipath routing method for WBANs. QQMR classifies data into three priority levels and employs adaptive multi-level queuing and fuzzy C-means […]
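The abstract names Q-learning for next-hop selection. A generic tabular Q-update of the kind such routing protocols build on, with state = current node and action = chosen neighbor; the reward signal and parameter values here are illustrative, not QQMR's actual design:

```python
def q_update(Q, node, next_hop, reward, neighbors_of_next,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for hop-by-hop routing:
    Q[node][next_hop] moves toward reward + gamma * best value
    reachable from next_hop."""
    best_next = max((Q[next_hop][n] for n in neighbors_of_next), default=0.0)
    td_target = reward + gamma * best_next
    Q[node][next_hop] += alpha * (td_target - Q[node][next_hop])
    return Q[node][next_hop]
```

In a QoS-aware setting like the one described, the `reward` would fold in per-priority metrics such as delay, energy, and link reliability; how QQMR weights those terms is in the truncated portion of the abstract.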

Weak-Link Optimization for Multi-Agent Reasoning and Collaboration

arXiv:2604.15972v1 Announce Type: new Abstract: LLM-driven multi-agent frameworks address complex reasoning tasks through multi-role collaboration. However, existing approaches often suffer from reasoning instability, where individual agent errors are amplified through collaboration, undermining overall performance. Current research mainly focuses on enhancing high-capability agents or suppressing unreliable outputs to improve framework effectiveness, while systematic identification and reinforcement […]

PolicyBank: Evolving Policy Understanding for LLM Agents

arXiv:2604.15505v1 Announce Type: cross Abstract: LLM agents operating under organizational policies must comply with authorization constraints typically specified in natural language. In practice, such specifications inevitably contain ambiguities and logical or semantic gaps that cause the agent’s behavior to systematically diverge from the true requirements. We ask: by letting an agent evolve its policy understanding […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.