arXiv:2604.08639v1 Announce Type: cross Abstract: Uncertainty quantification (UQ) is essential for deploying deep learning models in safety-critical applications, yet no consensus exists on which UQ method performs best across different data modalities and distribution shifts. This paper presents a comprehensive benchmark of ten widely used UQ baselines including MC Dropout, SWAG, ensemble methods, temperature […]
Artifacts as Memory Beyond the Agent Boundary
arXiv:2604.08756v1 Announce Type: new Abstract: The situated view of cognition holds that intelligent behavior depends not only on internal memory, but on an agent’s active use of environmental resources. Here, we begin formalizing this intuition within Reinforcement Learning (RL). We introduce a mathematical framing for how the environment can functionally serve as an agent’s memory, […]
Semantic Rate-Distortion for Bounded Multi-Agent Communication: Capacity-Derived Semantic Spaces and the Communication Cost of Alignment
arXiv:2604.09521v1 Announce Type: cross Abstract: When two agents of different computational capacities interact with the same environment, they need not compress a common semantic alphabet differently; they can induce different semantic alphabets altogether. We show that the quotient POMDP $Q_{m,T}(M)$ – the unique coarsest abstraction consistent with an agent’s capacity – serves as a capacity-derived […]
Fine-tuning is Not Enough: A Parallel Framework for Collaborative Imitation and Reinforcement Learning in End-to-end Autonomous Driving
arXiv:2603.13842v3 Announce Type: replace-cross Abstract: End-to-end autonomous driving is typically built upon imitation learning (IL), yet its performance is constrained by the quality of human demonstrations. To overcome this limitation, recent methods incorporate reinforcement learning (RL) through sequential fine-tuning. However, such a paradigm remains suboptimal: sequential RL fine-tuning can introduce policy drift and often leads […]
Quantum-like Cognition in Process Theories: An Analysis
arXiv:2604.08604v1 Announce Type: new Abstract: Various effects in human cognition, often considered ‘non-classical’, have been argued to be most naturally modelled by quantum-like models of decision making. We extend this approach to describe models of cognition and decision-making in general probabilistic process theories, which include both classical probabilistic models and quantum instrument models as special […]
ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion
arXiv:2604.09450v1 Announce Type: cross Abstract: Chest X-ray report generation (CXR-RG) has the potential to substantially alleviate radiologists’ workload. However, conventional autoregressive vision–language models (VLMs) suffer from high inference latency due to sequential token decoding. Diffusion-based models offer a promising alternative through parallel generation, but they still require multiple denoising iterations. Compressing multi-step denoising to a […]
Resolving satellite-in situ mismatches in Net Primary Production using high-frequency in situ bio-optical observations in the subpolar Northwest Atlantic
arXiv:2604.08634v1 Announce Type: new Abstract: Net primary productivity (NPP) forms the basis of the biological carbon pump, but its estimates in high-latitude regions remain highly uncertain despite its disproportionate importance for the global carbon sink. Optical satellites are limited by cloud cover, low irradiance, and shallow light penetration, with uncertainties further exacerbated by the lack of […]
DDSP-QbE++: Improving Speech Quality for Speech Anonymisation for Atypical Speech
arXiv:2604.09246v1 Announce Type: cross Abstract: Differentiable Digital Signal Processing (DDSP) pipelines for voice conversion rely on subtractive synthesis, where a periodic excitation signal is shaped by a learned spectral envelope to reconstruct the target voice. In DDSP-QbE, the excitation is generated via phase accumulation, producing a sawtooth-like waveform whose abrupt discontinuities introduce aliasing artefacts that […]
EchoTrail-GUI: Building Actionable Memory for GUI Agents via Critic-Guided Self-Exploration
arXiv:2512.19396v3 Announce Type: replace Abstract: Contemporary GUI agents, while increasingly capable due to advances in Large Vision-Language Models (VLMs), often operate with a critical limitation: they treat each task in isolation, lacking a mechanism to systematically learn from past successes. This digital “amnesia” results in sub-optimal performance, repeated errors, and poor generalization to novel challenges. […]
Reflection of Episodes: Learning to Play Game from Expert and Self Experiences
arXiv:2502.13388v4 Announce Type: replace Abstract: StarCraft II is a complex and dynamic real-time strategy (RTS) game environment that is well suited to artificial intelligence and reinforcement learning research. To address the problem of Large Language Model (LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes (ROE) framework based on expert experience and self-experience. […]
Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs
arXiv:2604.09021v1 Announce Type: cross Abstract: Auditory large language models (ALLMs) have demonstrated strong general capabilities in audio understanding and reasoning tasks. However, their reliability is still undermined by hallucination issues. Existing hallucination evaluation methods are formulated as binary classification tasks, which are insufficient to characterize the more complex hallucination patterns that arise in generative tasks. […]
Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study
arXiv:2604.08621v1 Announce Type: new Abstract: In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisation of static, rule-based messaging strategies. While adaptive and autonomous learning systems offer the promise of scalable personalisation, it remains unclear to what extent “human-in-the-loop” oversight is required to sustain performance uplift over time. This paper presents […]