Unifying Speech Editing Detection and Content Localization via Prior-Enhanced Audio LLMs

arXiv:2601.21463v2 Announce Type: replace-cross Abstract: Existing speech editing detection (SED) datasets are predominantly constructed using manual splicing or limited editing operations, resulting in restricted diversity and poor coverage of realistic editing scenarios. Meanwhile, current SED methods rely heavily on frame-level supervision to detect observable acoustic anomalies, which fundamentally limits their ability to handle deletion-type edits, […]

Frailty Estimation in Elderly Oncology Patients Using Multimodal Wearable Data and Multi-Instance Learning

arXiv:2604.06985v1 Announce Type: cross Abstract: Frailty and functional decline strongly influence treatment tolerance and outcomes in older patients with cancer, yet assessment is typically limited to infrequent clinic visits. We propose a multimodal wearable framework to estimate frailty-related functional change between visits in elderly breast cancer patients enrolled in the multicenter CARDIOCARE study. Free-living smartwatch […]

Making Room for AI: Multi-GPU Molecular Dynamics with Deep Potentials in GROMACS

arXiv:2604.07276v1 Announce Type: cross Abstract: GROMACS is a de facto standard for classical Molecular Dynamics (MD). The rise of AI-driven interatomic potentials that pursue near-quantum accuracy at MD throughput now poses a significant challenge: embedding neural-network inference into multi-GPU simulations while retaining high performance. In this work, we integrate the MLIP framework DeePMD-kit into GROMACS, enabling domain-decomposed, GPU-accelerated […]

KITE: Keyframe-Indexed Tokenized Evidence for VLM-Based Robot Failure Analysis

arXiv:2604.07034v1 Announce Type: cross Abstract: We present KITE, a training-free, keyframe-anchored, layout-grounded front-end that converts long robot-execution videos into compact, interpretable tokenized evidence for vision-language models (VLMs). KITE distills each trajectory into a small set of motion-salient keyframes with open-vocabulary detections and pairs each keyframe with a schematic bird’s-eye-view (BEV) representation that encodes relative object […]

AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage

arXiv:2505.20662v3 Announce Type: replace Abstract: Efficient reproduction of research papers is pivotal to accelerating scientific progress. However, the increasing complexity of proposed methods often renders reproduction a labor-intensive endeavor, necessitating profound domain expertise. To address this, we introduce the paper lineage, which systematically mines implicit knowledge from the cited literature. This algorithm serves as the […]

Information as Structural Alignment: A Dynamical Theory of Continual Learning

arXiv:2604.07108v1 Announce Type: cross Abstract: Catastrophic forgetting is not an engineering failure. It is a mathematical consequence of storing knowledge as global parameter superposition. Existing methods, such as regularization, replay, and frozen subnetworks, add external mechanisms to a shared-parameter substrate. None derives retention from the learning dynamics themselves. This paper introduces the Informational Buildup Framework […]

Generative Photomosaic with Structure-Aligned and Personalized Diffusion

arXiv:2604.06989v1 Announce Type: cross Abstract: We present the first generative approach to photomosaic creation. Traditional photomosaic methods rely on a large number of tile images and color-based matching, which limits both diversity and structural consistency. Our generative photomosaic framework synthesizes tile images using diffusion-based generation conditioned on reference images. A low-frequency conditioned diffusion mechanism aligns […]

Energy-based Tissue Manifolds for Longitudinal Multiparametric MRI Analysis

arXiv:2604.07180v1 Announce Type: cross Abstract: We propose a geometric framework for longitudinal multi-parametric MRI analysis based on patient-specific energy modelling in sequence space. Rather than operating on images with spatial networks, each voxel is represented by its multi-sequence intensity vector ($T1$, $T1c$, $T2$, FLAIR, ADC), and a compact implicit neural representation is trained via denoising […]

Mixed-Initiative Context: Structuring and Managing Context for Human-AI Collaboration

arXiv:2604.07121v1 Announce Type: cross Abstract: In the human-AI collaboration area, the context formed naturally through multi-turn interactions is typically flattened into a chronological sequence and treated as a fixed whole in subsequent reasoning, with no mechanism for dynamic organization and management along the collaboration workflow. Yet these contexts differ substantially in lifecycle, structural hierarchy, and […]

Validated Intent Compilation for Constrained Routing in LEO Mega-Constellations

arXiv:2604.07264v1 Announce Type: cross Abstract: Operating LEO mega-constellations requires translating high-level operator intents (“reroute financial traffic away from polar links under 80 ms”) into low-level routing constraints — a task that demands both natural language understanding and network-domain expertise. We present an end-to-end system comprising three components: (1) a GNN cost-to-go router that distills Dijkstra-quality […]

BadImplant: Injection-based Multi-Targeted Graph Backdoor Attack

arXiv:2601.15474v2 Announce Type: replace-cross Abstract: Graph neural networks (GNNs) have demonstrated exceptional performance in solving critical problems across diverse domains, yet remain susceptible to backdoor attacks. Existing studies on backdoor attacks for graph classification are limited to single-target attacks using a subgraph-replacement-based mechanism, where the attacker implants only one trigger into the GNN […]

Syntax Is Easy, Semantics Is Hard: Evaluating LLMs for LTL Translation

arXiv:2604.07321v1 Announce Type: cross Abstract: Propositional Linear Temporal Logic (LTL) is a popular formalism for specifying desirable requirements and security and privacy policies for software, networks, and systems. Yet expressing such requirements and policies in LTL remains challenging because of its intricate semantics. Since many security and privacy analysis tools require LTL formulas as input, […]

