Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors

From Untamed Black Box to Interpretable Pedagogical Orchestration: The Ensemble of Specialized LLMs Architecture for Adaptive Tutoring

arXiv:2603.23990v1 Announce Type: cross Abstract: Monolithic Large Language Models (LLMs) used in educational dialogue often behave as “black boxes,” where pedagogical decisions are implicit and

Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning

arXiv:2603.24083v1 Announce Type: cross Abstract: This paper introduces Knowledge Graph based Massively Multi-task Model-based Policy Optimization (KG-M3PO), a framework for multi-task robotic manipulation in partially

LLMs Do Not Grade Essays Like Humans

arXiv:2603.23714v1 Announce Type: new Abstract: Large language models have recently been proposed as tools for automated essay scoring, but their agreement with human grading remains

GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

arXiv:2603.24329v1 Announce Type: cross Abstract: Multimodal LLMs are increasingly deployed as perceptual backbones for autonomous agents in 3D environments, from robotics to virtual worlds. These

UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

arXiv:2603.24533v1 Announce Type: cross Abstract: Autonomous mobile GUI agents have attracted increasing attention along with the advancement of Multimodal Large Language Models (MLLMs). However, existing