Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining

arXiv:2604.12498v1 Announce Type: cross Abstract: We present Lit2Vec, a reproducible workflow for constructing and validating a chemistry corpus from the Semantic Scholar Open Research Corpus using conservative, metadata-based license screening. Using this workflow, we assembled an internal study corpus of 582,683 chemistry-specific full-text research articles with structured full text, token-aware paragraph chunks, paragraph-level embeddings generated […]

Intelligent ROI-Based Vehicle Counting Framework for Automated Traffic Monitoring

arXiv:2604.12470v1 Announce Type: new Abstract: Accurate vehicle counting through video surveillance is crucial for efficient traffic management. However, achieving high counting accuracy while ensuring computational efficiency remains a challenge. To address this, we propose a fully automated, video-based vehicle counting framework designed to optimize both computational efficiency and counting accuracy. Our framework operates in two […]

MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents

arXiv:2604.12352v1 Announce Type: new Abstract: RAG-based QA has emerged as a powerful method for processing long industrial documents. However, conventional text chunking approaches often neglect complex and long industrial document structures, causing information loss and reduced answer quality. To address this, we introduce MultiDocFusion, a multimodal chunking pipeline that integrates: (i) detection of document regions […]

PromptEcho: Annotation-Free Reward from Vision-Language Models for Text-to-Image Reinforcement Learning

arXiv:2604.12652v1 Announce Type: cross Abstract: Reinforcement learning (RL) can improve the prompt following capability of text-to-image (T2I) models, yet obtaining high-quality reward signals remains challenging: CLIP Score is too coarse-grained, while VLM-based reward models (e.g., RewardDance) require costly human-annotated preference data and additional fine-tuning. We propose PromptEcho, a reward construction method that requires emphno annotation […]

A Scoping Review of Large Language Model-Based Pedagogical Agents

arXiv:2604.12253v1 Announce Type: new Abstract: This scoping review examines the emerging field of Large Language Model (LLM)-based pedagogical agents in educational settings. While traditional pedagogical agents have been extensively studied, the integration of LLMs represents a transformative advancement with unprecedented capabilities in natural language understanding, reasoning, and adaptation. Following PRISMA-ScR guidelines, we analyzed 52 studies […]

Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production

arXiv:2604.12667v1 Announce Type: new Abstract: Human-robot collaborative manufacturing, a core aspect of Industry 5.0, emphasizes ergonomics to enhance worker well-being. This paper addresses the dynamic human-robot task planning and allocation (HRTPA) problem, which involves determining when to perform tasks and who should execute them to maximize efficiency while ensuring workers’ physical fatigue remains within safe […]

The illusory simplicity of the feedforward pass: evidence for the dynamical nature of stimulus encoding along the primate ventral stream

arXiv:2604.12825v1 Announce Type: new Abstract: In studying primate vision, a large body of work focuses on the first feedforward sweep. During this initial time window, information is thought to pass through ventral stream regions in a stage-like fashion in an effort to extract high-level information from the retinal input. Consequently, electrophysiological analyses commonly focus on […]

Contextual Multi-Task Reinforcement Learning for Autonomous Reef Monitoring

arXiv:2604.12645v1 Announce Type: cross Abstract: Although autonomous underwater vehicles promise the capability of marine ecosystem monitoring, their deployment is fundamentally limited by the difficulty of controlling vehicles under highly uncertain and non-stationary underwater dynamics. To address these challenges, we employ a data-driven reinforcement learning approach to compensate for unknown dynamics and task variations.Traditional single-task reinforcement […]

Architecture-Agnostic Modality-Isolated Gated Fusion for Robust Multi-Modal Prostate MRI Segmentation

arXiv:2604.10702v2 Announce Type: replace-cross Abstract: Multi-parametric prostate MRI — combining T2-weighted, apparent diffusion coefficient, and high b-value diffusion-weighted sequences — is central to non-invasive detection of clinically significant prostate cancer, yet in routine practice individual sequences may be missing or degraded by motion, artifacts, or abbreviated protocols. Existing multi-modal fusion strategies typically assume complete inputs […]

TimeSAF: Towards LLM-Guided Semantic Asynchronous Fusion for Time Series Forecasting

arXiv:2604.12648v1 Announce Type: cross Abstract: Despite the recent success of large language models (LLMs) in time-series forecasting, most existing methods still adopt a Deep Synchronous Fusion strategy, where dense interactions between textual and temporal features are enforced at every layer of the network. This design overlooks the inherent granularity mismatch between modalities and leads to […]

Lightweight Low-Light Image Enhancement via Distribution-Normalizing Preprocessing and Depthwise U-Net

arXiv:2604.11071v2 Announce Type: replace-cross Abstract: We present a lightweight two-stage framework for low-light image enhancement (LLIE) that achieves competitive perceptual quality with significantly fewer parameters than existing methods. Our approach combines frozen algorithm-based preprocessing with a compact U-Net built entirely from depthwise-separable convolutions. The preprocessing normalizes the input distribution by providing complementary brightness-corrected views, enabling […]

Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs

arXiv:2604.12651v1 Announce Type: cross Abstract: Knowledge graph embedding (KGE) models perform well on link prediction but struggle with unseen entities, relations, and especially literals, limiting their use in dynamic, heterogeneous graphs. In contrast, pretrained large language models (LLMs) generalize effectively through prompting. We reformulate link prediction as a prompt learning problem and introduce RALP, which […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844