Complementarity-Preserving Generative Theory for Multimodal ECG Synthesis: A Quantum-Inspired Approach

arXiv:2603.26695v1 Announce Type: cross Abstract: Multimodal deep learning has substantially improved electrocardiogram (ECG) classification by jointly leveraging time, frequency, and time-frequency representations. However, existing generative models typically synthesize these modalities independently, resulting in synthetic ECG data that are visually plausible yet physiologically inconsistent across domains. This work establishes a Complementarity-Preserving Generative Theory (CPGT), which posits […]

SutureAgent: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space

arXiv:2603.26720v1 Announce Type: cross Abstract: Predicting surgical needle trajectories from endoscopic video is critical for robot-assisted suturing, enabling anticipatory planning, real-time guidance, and safer motion execution. Existing methods that directly learn motion distributions from visual observations tend to overlook the sequential dependency among adjacent motion steps. Moreover, sparse waypoint annotations often fail to provide sufficient […]

Contextual inference from single objects in Vision-Language models

arXiv:2603.26731v1 Announce Type: cross Abstract: How much scene context a single object carries is a well-studied question in human scene perception, yet how this capacity is organized in vision-language models (VLMs) remains poorly understood, with direct implications for the robustness of these models. We investigate this question through a systematic behavioral and mechanistic analysis of […]

LARD 2.0: Enhanced Datasets and Benchmarking for Autonomous Landing Systems

arXiv:2603.26748v1 Announce Type: cross Abstract: This paper addresses key challenges in the development of autonomous landing systems, focusing on dataset limitations for supervised training of Machine Learning (ML) models for object detection. Our main contributions include: (1) Enhancing dataset diversity, by advocating for the inclusion of new sources such as BingMap aerial images and Flight […]

Entropic Claim Resolution: Uncertainty-Driven Evidence Selection for RAG

arXiv:2603.28444v1 Announce Type: new Abstract: Current Retrieval-Augmented Generation (RAG) systems predominantly rely on relevance-based dense retrieval, sequentially fetching documents to maximize semantic similarity with the query. However, in knowledge-intensive and real-world scenarios characterized by conflicting evidence or fundamental query ambiguity, relevance alone is insufficient for resolving epistemic uncertainty. We introduce Entropic Claim Resolution (ECR), a […]

A Normative Theory of Decision Making from Multiple Stimuli: The Contextual Diffusion Decision Model

arXiv:2603.28600v1 Announce Type: new Abstract: The dynamics of simple two-alternative forced-choice (2AFC) decisions are well-modeled by a class of random walk models (e.g. Laming, 1968; Ratcliff, 1978; Usher & McClelland, 2001; Bogacz et al., 2006). However, in real life, even simple decisions involve the dynamically changing influence of additional information. In this work, we describe a computational […]

Dynamic Dual-Granularity Skill Bank for Agentic RL

arXiv:2603.28716v1 Announce Type: new Abstract: Agentic reinforcement learning (RL) can benefit substantially from reusable experience, yet existing skill-based methods mainly extract trajectory-level guidance and often lack principled mechanisms for maintaining an evolving skill memory. We propose D2Skill, a dynamic dual-granularity skill bank for agentic RL that organizes reusable experience into task skills for high-level guidance […]

M-RAG: Making RAG Faster, Stronger, and More Efficient

arXiv:2603.26667v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct retrieval units, which often introduce information fragmentation, retrieval noise, and reduced efficiency. Recent work has even questioned […]

CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform

arXiv:2602.08482v2 Announce Type: replace-cross Abstract: Vessel trajectory data from the Automatic Identification System (AIS) is used widely in maritime analytics. Yet, analysis is difficult for non-expert users due to the incompleteness and complexity of AIS data. We present CLEAR, a knowledge-centric vessel trajectory analysis platform that aims to overcome these barriers. By leveraging the reasoning […]

A Deep Reinforcement Learning Framework for Closed-loop Guidance of Fish Schools via Virtual Agents

arXiv:2603.28200v1 Announce Type: cross Abstract: Guiding collective motion in biological groups is a fundamental challenge in understanding social interaction rules and developing automated systems for animal management. In this study, we propose a deep reinforcement learning (RL) framework for the closed-loop guidance of fish schools using virtual agents. These agents are controlled by policies trained […]

Quid est VERITAS? A Modular Framework for Archival Document Analysis

arXiv:2603.28108v1 Announce Type: cross Abstract: The digitisation of historical documents has traditionally been conceived as a process limited to character-level transcription, producing flat text that lacks the structural and semantic information necessary for substantive computational analysis. We present VERITAS (Vision-Enhanced Reading, Interpretation, and Transcription of Archival Sources), a modular, model-agnostic framework that reconceptualises digitisation as […]

JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding

arXiv:2603.27942v2 Announce Type: cross Abstract: Japanese scene text poses challenges that multilingual benchmarks often fail to capture, including mixed scripts, frequent vertical writing, and a character inventory far larger than the Latin alphabet. Although Japanese is included in several multilingual benchmarks, these resources do not adequately capture the language-specific complexities. Meanwhile, existing Japanese visual text […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.