Sinkhorn doubly stochastic attention rank decay analysis

arXiv:2604.07925v1 Announce Type: cross Abstract: The self-attention mechanism is central to the success of Transformer architectures. However, standard row-stochastic attention has been shown to suffer from significant signal degradation across layers. In particular, it can induce rank collapse, resulting in increasingly uniform token representations, as well as entropy collapse, characterized by highly concentrated attention distributions. […]

Quantifying the Spatiotemporal Dynamics of Engineered Cardiac Microbundles

arXiv:2604.07576v1 Announce Type: new Abstract: Brightfield time-lapse imaging is widely used in cardiac tissue engineering, yet the absence of standardized, interpretable analytical frameworks limits reproducibility and cross-platform comparison. We present an open, scalable computational pipeline for quantifying spatiotemporal contractile dynamics in microscopy videos of human induced pluripotent stem cell-derived cardiac microbundles. Building on our open-source […]

QuantumXCT: Learning Interaction-Induced State Transformation in Cell-Cell Communication via Quantum Entanglement and Generative Modeling

arXiv:2604.02203v2 Announce Type: replace-cross Abstract: Inferring cell-cell communication (CCC) from single-cell transcriptomics remains fundamentally limited by reliance on curated ligand-receptor databases, which primarily capture co-expression rather than the system-level effects of signaling on cellular states. Here, we introduce QuantumXCT, a hybrid quantum-classical generative framework that reframes CCC as a problem of learning interaction-induced state transformations […]

From Papers to Property Tables: A Priority-Based LLM Workflow for Materials Data Extraction

arXiv:2604.07584v1 Announce Type: new Abstract: Scientific data are widely dispersed across research articles and are often reported inconsistently across text, tables, and figures, making manual data extraction and aggregation slow and error-prone. We present a prompt-driven, hierarchical workflow that uses a large language model (LLM) to automatically extract and reconstruct structured, shot-level shock-physics experimental records […]

EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization

arXiv:2604.08213v1 Announce Type: cross Abstract: High-quality training triplets (source-target image pairs with precise editing instructions) are a critical bottleneck for scaling instruction-guided image editing models. Vision-language models (VLMs) are widely used for automated instruction synthesis, but we identify three systematic failure modes in image-pair settings: orientation inconsistency (e.g., left/right confusion), viewpoint ambiguity, and insufficient fine-grained […]

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

arXiv:2604.07429v1 Announce Type: cross Abstract: Towards an embodied generalist for real-world interaction, Multimodal Large Language Model (MLLM) agents still suffer from challenging latency, sparse feedback, and irreversible mistakes. Video games offer an ideal testbed with rich visual observations and closed-loop interaction, demanding fine-grained perception, long-horizon planning, and precise control. However, systematically evaluating these capabilities is […]

Too long; didn’t solve

arXiv:2604.07593v1 Announce Type: new Abstract: Mathematical benchmarks consisting of a range of mathematics problems are widely used to evaluate the reasoning abilities of large language models, yet little is known about how their structural properties influence model behaviour. In this work, we investigate two structural length variables, prompt length and solution length, and analyse how […]

InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding

arXiv:2604.08337v1 Announce Type: cross Abstract: Current vision-language pre-training (VLP) paradigms excel at global scene understanding but struggle with instance-level reasoning due to global-only supervision. We introduce InstAP, an Instance-Aware Pre-training framework that jointly optimizes global vision-text alignment and fine-grained, instance-level contrastive alignment by grounding textual mentions to specific spatial-temporal regions. To support this, we present […]

Beyond Human-Readable: Rethinking Software Engineering Conventions for the Agentic Development Era

arXiv:2604.07502v1 Announce Type: cross Abstract: For six decades, software engineering principles have been optimized for a single consumer: the human developer. The rise of agentic AI development, where LLM-based agents autonomously read, write, navigate, and debug codebases, introduces a new primary consumer with fundamentally different constraints. This paper presents a systematic analysis of human-centric conventions […]

EMSDialog: Synthetic Multi-person Emergency Medical Service Dialogue Generation from Electronic Patient Care Reports via Multi-LLM Agents

arXiv:2604.07549v1 Announce Type: cross Abstract: Conversational diagnosis prediction requires models to track evolving evidence in streaming clinical conversations and decide when to commit to a diagnosis. Existing medical dialogue corpora are largely dyadic or lack the multi-party workflow and annotations needed for this setting. We introduce an ePCR-grounded, topic-flow-based multi-agent generation pipeline that iteratively plans, […]

A Machine Learning Framework for Turbofan Health Estimation via Inverse Problem Formulation

arXiv:2604.08460v1 Announce Type: cross Abstract: Estimating the health state of turbofan engines is a challenging ill-posed inverse problem, hindered by sparse sensing and complex nonlinear thermodynamics. Research in this area remains fragmented, with comparisons limited by the use of unrealistic datasets and insufficient exploration of the exploitation of temporal information. This work investigates how to […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844