arXiv:2605.01810v1 Announce Type: cross Abstract: Gestational Diabetes Mellitus (GDM) is a high-prevalence pregnancy complication that requires accurate early risk stratification to reduce maternal and fetal morbidity. However, real-world clinical deployment of machine learning is hindered by two coupled constraints: (i) label scarcity, where a large fraction of electronic health records (EHR) lack confirmed diagnostic labels, […]
Towards High Fidelity Face Swapping: A Comprehensive Survey and New Benchmark
arXiv:2605.00883v1 Announce Type: cross Abstract: Face swapping has witnessed significant progress in recent years, largely driven by advances in deep generative models such as GANs and diffusion models. Despite these advances, existing methods remain fragmented across different paradigms, and their evaluation is highly inconsistent due to the lack of standardized datasets and protocols. Moreover, prior surveys […]
A Knowledge-Driven LLM-Based Decision-Support System for Explainable Defect Analysis and Mitigation Guidance in Laser Powder Bed Fusion
arXiv:2605.01100v1 Announce Type: new Abstract: This work presents a knowledge-driven decision-support system that integrates structured defect knowledge with LLM-based reasoning to provide explainable defect diagnosis and mitigation guidance in manufacturing, using LPBF as a representative, safety-critical case study. The proposed ontology-integrated LLM-based decision support system for LPBF defect analysis and mitigation guidance is built on […]
DIAGRAMS: A Review Framework for Reasoning-Level Attribution in Diagram QA
arXiv:2605.00905v1 Announce Type: cross Abstract: Diagram question answering (Diagram QA) requires reasoning-level attribution that links each question-answer pair to all visual regions needed to derive the answer, rather than only the region containing the final response. Creating such structured evidence across diagrams, charts, maps, circuits, and infographics is time-consuming, and existing annotation tools tightly couple […]
Linking spatial biology and clinical histology via Haiku
arXiv:2605.00925v1 Announce Type: cross Abstract: Integrating molecular, morphological, and clinical data is essential for basic and translational biomedical research, yet systematic frameworks for jointly modeling these modalities remain limited. Here we present Haiku, a tri-modal contrastive learning model trained on multiplexed immunofluorescence (mIF). It comprises 26.7 million spatial proteomics patches from 3,218 tissue sections across […]
Virtual Speech Therapist: A Clinician-in-the-Loop AI Speech Therapy Agent for Personalized and Supervised Therapy
arXiv:2605.01101v1 Announce Type: new Abstract: This paper develops Virtual Speech Therapist (VST), an intelligent agent-based platform that streamlines stuttering assessment and delivers customized therapy planning through automated and adaptive AI-driven workflows. VST integrates state-of-the-art deep learning-based stuttering classification and multi-agent large language model (LLM) reasoning to support evidence-based clinical decision-making. The VST begins with the […]
SCARV: Structure-Constrained Aggregation for Stable Sample Ranking in Redundant NLP Datasets
arXiv:2605.00944v1 Announce Type: cross Abstract: Sample-level rankings are increasingly used in data-centric NLP for analysis, filtering, debugging, and curation, yet existing pipelines typically score training examples pointwise and rank them as if they were independent. This assumption is fragile in the presence of exact duplicates, near-duplicates, paraphrases, and other redundant structure common in NLP corpora, […]
Beyond Isolated Investor: Predicting Startup Success via Roleplay-Based Collective Agents
arXiv:2512.22608v3 Announce Type: replace Abstract: Due to the high value and high failure rates of startups, predicting their success is a critical challenge. Existing approaches typically model startup success from a single decision-maker’s perspective, overlooking the collective dynamics that dominate real-world venture capital (VC) decision-making. We propose SimVC-CAS, a collective agent system that simulates VC […]
MedMosaic: A Challenging Large Scale Benchmark of Diverse Medical Audio
arXiv:2605.00969v1 Announce Type: cross Abstract: We present MedMosaic, a medical audio question-answering dataset designed to benchmark language and audio reasoning models under realistic clinical constraints. Medical audio data is difficult to collect due to privacy regulations and high annotation costs arising from domain expertise. Thus, existing benchmarks tend to underrepresent complex medical audio scenarios. To […]
Towards Multi-Agent Autonomous Reasoning in Hydrodynamics
arXiv:2605.01102v1 Announce Type: new Abstract: Single-agent systems (SAS) have become the default pattern for LLM-driven scientific workflows, but routing planning, tool use, and synthesis through a single context window comes with a well-known cost: as tool specifications and observational traces accumulate, the effective context available for each decision shrinks, and end-to-end reliability suffers. We present […]
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
arXiv:2604.14258v3 Announce Type: replace Abstract: Large language models are typically post-trained using supervised fine-tuning (SFT) and reinforcement learning (RL), yet effectively unifying efficient knowledge injection with robust generalization remains challenging. In this work, we provide a training-dynamics analysis showing that SFT can be interpreted as a special case of policy gradient optimization with an extremely […]
Component-Aware Self-Speculative Decoding in Hybrid Language Models
arXiv:2605.01106v1 Announce Type: cross Abstract: Speculative decoding accelerates autoregressive inference by drafting candidate tokens with a fast model and verifying them in parallel with the target. Self-speculative methods avoid the need for an external drafter but have been studied exclusively in homogeneous Transformer architectures. We introduce component-aware self-speculative decoding, the first method to exploit the […]