arXiv:2604.11131v1 Announce Type: new Abstract: Reinforcement learning (RL) is one of the most practical ways to learn from real-life use-cases. Motivated from the cognitive methods used by humans makes it a widely acceptable strategy in the field of artificial intelligence. Most of the environments used for RL are often high-dimensional, and traditional RL algorithms becomes […]
Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory
arXiv:2603.02473v2 Announce Type: replace Abstract: Memory-augmented LLM agents store and retrieve information from prior interactions, yet the relative importance of how memories are written versus how they are retrieved remains unclear. We introduce a diagnostic framework that analyzes how performance differences manifest across write strategies, retrieval methods, and memory utilization behavior, and apply it to […]
Consistency of AI-Generated Exercise Prescriptions: A Repeated Generation Study Using a Large Language Model
arXiv:2604.11287v1 Announce Type: new Abstract: Background: Large language models (LLMs) have been explored as tools for generating personalized exercise prescriptions, yet the consistency of outputs under identical conditions remains insufficiently examined. Objective: This study evaluated the intra-model consistency of LLM-generated exercise prescriptions using a repeated generation design. Methods: Six clinical scenarios were used to generate […]
Agentic Exploration of PDE Spaces using Latent Foundation Models for Parameterized Simulations
arXiv:2604.09584v1 Announce Type: new Abstract: Flow physics and more broadly physical phenomena governed by partial differential equations (PDEs), are inherently continuous, high-dimensional and often chaotic in nature. Traditionally, researchers have explored these rich spatiotemporal PDE solution spaces using laboratory experiments and/or computationally expensive numerical simulations. This severely limits automated and large-scale exploration, unlike domains such […]
From Agent Loops to Structured Graphs:A Scheduler-Theoretic Framework for LLM Agent Execution
arXiv:2604.11378v1 Announce Type: new Abstract: The dominant paradigm for building LLM based agents is the Agent Loop, an iterative cycle where a single language model decides what to do next by reading an ever growing context window. This paradigm has three structural weaknesses: implicit dependencies between steps, unbounded recovery loops, and mutable execution history that […]
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
arXiv:2505.17022v2 Announce Type: replace-cross Abstract: Visual generation models have made remarkable progress in creating realistic images from text prompts, yet struggle with complex prompts that specify multiple objects with precise spatial relationships and attributes. Effective handling of such prompts requires explicit reasoning about the semantic content and spatial layout. We present GoT-R1, a framework that […]
Integrated information theory: the good, the bad and the misunderstood
arXiv:2604.11482v1 Announce Type: new Abstract: The integrated information theory of consciousness (IIT) is uniquely ambitious in proposing a mathematical formula, derived from apparently fundamental properties of conscious experience, to describe the quantity and quality of consciousness for any physical system that possesses it. IIT has generated considerable debate, which has engendered some misunderstandings and misrepresentations. […]
MobiFlow: Real-World Mobile Agent Benchmarking through Trajectory Fusion
arXiv:2604.09587v1 Announce Type: new Abstract: Mobile agents can autonomously complete user-assigned tasks through GUI interactions. However, existing mainstream evaluation benchmarks, such as AndroidWorld, operate by connecting to a system-level Android emulator and provide evaluation signals based on the state of system resources. In real-world mobile-agent scenarios, however, many third-party applications do not expose system-level APIs […]
Will a Large Complex System be Stable? Revisited
arXiv:2604.11555v1 Announce Type: new Abstract: Over fifty years ago, Robert May applied random matrix theory to show that as ecological systems grow in size, stability decreases. What emerged from this and the critique that followed was decades of what has been called the complexity-stability debate. However, decades of critique over the assumptions that Robert May […]
Understanding Generalization in Role-Playing Models via Information Theory
arXiv:2512.17270v2 Announce Type: replace-cross Abstract: Role-playing models (RPMs) are widely used in real-world applications but underperform when deployed in the wild. This degradation can be attributed to distribution shifts, including user, character, and dialogue compositional shifts. Existing methods like LLM-as-a-judge fall short in providing a fine-grained diagnosis of how these shifts affect RPM generalization, and […]
Agentic Driving Coach: Robustness and Determinism of Agentic AI-Powered Human-in-the-Loop Cyber-Physical Systems
arXiv:2604.11705v1 Announce Type: new Abstract: Foundation models, including large language models (LLMs), are increasingly used for human-in-the-loop (HITL) cyber-physical systems (CPS) because foundation model-based AI agents can potentially interact with both the physical environments and human users. However, the unpredictable behavior of human users and AI agents, in addition to the dynamically changing physical environments, […]
LETGAMES: An LLM-Powered Gamified Approach to Cognitive Training for Patients with Cognitive Impairment
arXiv:2604.09566v1 Announce Type: cross Abstract: The application of games as a therapeutic tool for cognitive training is beneficial for patients with cognitive impairments. However, effective game design for individual patient is resource-intensive. To this end, we propose an LLM-powered method, ours, for automated and personalized therapeutic game design. Inspired by the Dungeons & Dragons, LETGAMES […]