DDSP-QbE++: Improving Speech Quality for Speech Anonymisation for Atypical Speech

arXiv:2604.09246v1 Announce Type: cross Abstract: Differentiable Digital Signal Processing (DDSP) pipelines for voice conversion rely on subtractive synthesis, where a periodic excitation signal is shaped by a learned spectral envelope to reconstruct the target voice. In DDSP-QbE, the excitation is generated via phase accumulation, producing a sawtooth-like waveform whose abrupt discontinuities introduce aliasing artefacts that […]

ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion

arXiv:2604.09450v1 Announce Type: cross Abstract: Chest X-ray report generation (CXR-RG) has the potential to substantially alleviate radiologists’ workload. However, conventional autoregressive vision–language models (VLMs) suffer from high inference latency due to sequential token decoding. Diffusion-based models offer a promising alternative through parallel generation, but they still require multiple denoising iterations. Compressing multi-step denoising to a […]

EchoTrail-GUI: Building Actionable Memory for GUI Agents via Critic-Guided Self-Exploration

arXiv:2512.19396v3 Announce Type: replace Abstract: Contemporary GUI agents, while increasingly capable due to advances in Large Vision-Language Models (VLMs), often operate with a critical limitation: they treat each task in isolation, lacking a mechanism to systematically learn from past successes. This digital “amnesia” results in sub-optimal performance, repeated errors, and poor generalization to novel challenges. […]

Reflection of Episodes: Learning to Play Game from Expert and Self Experiences

arXiv:2502.13388v4 Announce Type: replace Abstract: StarCraft II is a complex and dynamic real-time strategy (RTS) game environment that is well suited to artificial intelligence and reinforcement learning research. To address the problem of Large Language Model (LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes (ROE) framework based on expert experience and self-experience. […]

Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models

arXiv:2505.12509v3 Announce Type: replace-cross Abstract: Post-hoc explanations provide transparency and are essential for guiding model optimization, such as prompt engineering and data sanitation. However, applying model-agnostic techniques to Large Language Models (LLMs) is hindered by prohibitive computational costs, rendering these tools dormant for real-world applications. To revitalize model-agnostic interpretability, we propose a budget-friendly proxy framework […]

Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models

arXiv:2604.08588v1 Announce Type: cross Abstract: Effective automation hinges on deciding when to act and when to escalate. We model this as a decision under uncertainty: an LLM forms a prediction, estimates its probability of being correct, and compares the expected costs of acting and escalating. Using this framework across five domains of recorded human decisions-demand […]

Fine-tuning is Not Enough: A Parallel Framework for Collaborative Imitation and Reinforcement Learning in End-to-end Autonomous Driving

arXiv:2603.13842v3 Announce Type: replace-cross Abstract: End-to-end autonomous driving is typically built upon imitation learning (IL), yet its performance is constrained by the quality of human demonstrations. To overcome this limitation, recent methods incorporate reinforcement learning (RL) through sequential fine-tuning. However, such a paradigm remains suboptimal: sequential RL fine-tuning can introduce policy drift and often leads […]

Generalization and Scaling Laws for Mixture-of-Experts Transformers

arXiv:2604.09175v1 Announce Type: cross Abstract: We develop a theory of generalization and scaling for Mixture-of-Experts (MoE) Transformers that cleanly separates active per-input capacity from routing combinatorics. By conditioning on fixed routing patterns and union-bounding across them, we derive a sup-norm covering-number bound whose metric entropy scales with the active parameter budget and incurs a MoE-specific […]

Quantum-like Cognition in Process Theories: An Analysis

arXiv:2604.08604v1 Announce Type: new Abstract: Various effects in human cognition, often considered ‘non-classical’, have been argued to be most naturally modelled by quantum-like models of decision making. We extend this approach to describe models of cognition and decision-making in general probabilistic process theories, which include both classical probabilistic models and quantum instrument models as special […]

Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study

arXiv:2604.08621v1 Announce Type: new Abstract: In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisation of static, rule-based messaging strategies. While adaptive and autonomous learning systems offer the promise of scalable personalisation, it remains unclear to what extent “human-in-the-loop” oversight is required to sustain performance uplift over time. This paper presents […]

3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding

arXiv:2604.08645v1 Announce Type: cross Abstract: Large multimodal models are increasingly used as the reasoning core of embodied agents operating in 3D environments, yet they remain prone to hallucinations that can produce unsafe and ungrounded decisions. Existing inference-time hallucination mitigation methods largely target 2D vision-language settings and do not transfer to embodied 3D reasoning, where failures […]

From Business Events to Auditable Decisions: Ontology-Governed Graph Simulation for Enterprise AI

arXiv:2604.08603v1 Announce Type: new Abstract: Existing LLM-based agent systems share a common architectural failure: they answer from the unrestricted knowledge space without first simulating how active business scenarios reshape that space for the event at hand — producing decisions that are fluent but ungrounded and carrying no audit trail. We present LOM-action, which equips enterprise […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844