arXiv:2603.23146v2 Announce Type: replace-cross Abstract: The widespread adoption of Large Language Models (LLMs) has made the detection of AI-generated text a pressing and complex challenge. Although many detection systems report high benchmark accuracy, their reliability in real-world settings remains uncertain, and their interpretability is often unexplored. In this work, we investigate whether contemporary detectors genuinely […]
AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction
arXiv:2510.15339v3 Announce Type: cross Abstract: Building effective knowledge graphs (KGs) for Retrieval-Augmented Generation (RAG) is pivotal for advancing question answering (QA) systems. However, its effectiveness is hindered by a fundamental disconnect: the knowledge graph (KG) construction process is decoupled from its downstream application, yielding suboptimal graph structures. To bridge this gap, we introduce AutoGraph-R1, the […]
ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation
arXiv:2604.20666v1 Announce Type: cross Abstract: Effective retrieval-augmented generation across bilingual Greek–English applications requires embedding models capable of capturing both domain-specific semantic relationships and cross-lingual semantic alignment. Existing multilingual embedding models distribute their representational capacity across numerous languages, limiting their optimization for Greek and failing to encode the morphological complexity and domain-specific terminological structures inherent in […]
Self-Describing Structured Data with Dual-Layer Guidance: A Lightweight Alternative to RAG for Precision Retrieval in Large-Scale LLM Knowledge Navigation
arXiv:2604.19777v1 Announce Type: cross Abstract: Large Language Models (LLMs) exhibit a well-documented positional bias when processing long input contexts: information in the middle of a context window receives substantially less attention than content at the boundaries, a phenomenon termed the Lost-in-the-Middle effect (Liu et al., 2024). This limits knowledge-retrieval applications that embed large structured knowledge […]
VAN-AD: Visual Masked Autoencoder with Normalizing Flow For Time Series Anomaly Detection
arXiv:2603.26842v2 Announce Type: replace-cross Abstract: Time series anomaly detection (TSAD) is essential for maintaining the reliability and security of IoT-enabled service systems. Existing methods require training one specific model for each dataset, which exhibits limited generalization capability across different target datasets, hindering anomaly detection performance in various scenarios with scarce training data. To address this […]
AnatomicalNets: A Multi-Structure Segmentation and Contour-Based Distance Estimation Pipeline for Clinically Grounded Lung Cancer T-Staging
arXiv:2511.19367v2 Announce Type: replace-cross Abstract: Accurate tumor staging in lung cancer is crucial for prognosis and treatment planning and is governed by explicit anatomical criteria under fixed guidelines. However, most existing deep learning approaches treat this spatially structured clinical decision as an uninterpretable image classification problem. Tumor stage depends on predetermined quantitative criteria, including the […]
Device-Native Autonomous Agents for Privacy-Preserving Negotiations
arXiv:2601.00911v3 Announce Type: replace-cross Abstract: Automated negotiations in insurance and business-to-business (B2B) commerce encounter substantial challenges. Current systems force a trade-off between convenience and privacy by routing sensitive financial data through centralized servers, increasing security risks, and diminishing user trust. This study introduces a device-native autonomous Agentic AI system for privacy-preserving negotiations. The proposed system […]
AAC: Admissible-by-Architecture Differentiable Landmark Compression for ALT
arXiv:2604.20744v1 Announce Type: new Abstract: We introduce AAC (Architecturally Admissible Compressor), a differentiable landmark-selection module for ALT (A*, Landmarks, and Triangle inequality) shortest-path heuristics whose outputs are admissible by construction: each forward pass is a row-stochastic mixture of triangle-inequality lower bounds, so the heuristic is admissible for every parameter setting without requiring convergence, calibration, or […]
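The admissibility argument in this abstract rests on a general fact: each landmark bound |d(L, t) - d(L, v)| is a lower bound on d(v, t) in an undirected graph, and any convex combination of lower bounds is still a lower bound. The sketch below checks that on a toy graph. It is not the paper's AAC module; the graph, the landmark choice, the logits, and all function names are hypothetical, and a single softmax weight vector stands in for one row of a row-stochastic matrix.

```python
import heapq
import math


def dijkstra(adj, source):
    """Shortest-path distances from source in a weighted graph."""
    dist = {node: math.inf for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def softmax(logits):
    """Map arbitrary logits to non-negative weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_heuristic(landmark_dists, weights, v, t):
    """Convex combination of per-landmark triangle-inequality bounds."""
    bounds = [abs(d[t] - d[v]) for d in landmark_dists]
    return sum(w * b for w, b in zip(weights, bounds))


def is_admissible(adj, landmark_dists, weights, tol=1e-9):
    """Check h(v, t) <= d(v, t) for every pair of nodes."""
    for t in adj:
        true_dist = dijkstra(adj, t)
        for v in adj:
            if mixed_heuristic(landmark_dists, weights, v, t) > true_dist[v] + tol:
                return False
    return True


# Hypothetical toy graph (undirected, connected).
edges = [("a", "b", 2.0), ("b", "c", 2.0), ("a", "c", 5.0),
         ("c", "d", 1.0), ("b", "d", 4.0)]
adj = {}
for u, v, w in edges:
    adj.setdefault(u, []).append((v, w))
    adj.setdefault(v, []).append((u, w))

landmarks = ["a", "d"]  # assumed landmark set
landmark_dists = [dijkstra(adj, L) for L in landmarks]

# Any logits give valid mixture weights, so the bound holds for each setting.
for logits in ([0.3, -1.2], [5.0, 5.0], [-10.0, 10.0]):
    weights = softmax(logits)
    print(logits, is_admissible(adj, landmark_dists, weights))
```

Because the weights are non-negative and sum to one for any logits, the mixed value never exceeds the largest individual landmark bound, which is why no training or calibration is needed for the lower-bound property to hold in this sketch.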
LAFA: A Framework for Reproducible Longitudinal Assessment of Protein Function Annotation Models
arXiv:2604.20782v1 Announce Type: new Abstract: Motivation: Protein function prediction is a challenging task and an open problem in computational biology. The Critical Assessment of protein Function Annotation (CAFA) is a triennial, community-driven initiative that provides an independent, large-scale evaluation of computational methods for protein function prediction through time-delayed benchmarking experiments. CAFA has played a key […]
Coding with Eyes: Visual Feedback Unlocks Reliable GUI Code Generating and Debugging
arXiv:2604.19750v1 Announce Type: cross Abstract: Recent advances in Large Language Model (LLM)-based agents have shown remarkable progress in code generation. However, current agent methods mainly rely on text-output-based feedback (e.g. command-line outputs) for multi-round debugging and struggle in graphical user interfaces (GUIs) that involve visual information. This is mainly due to two limitations: 1) GUI […]
Automated Detection of Dosing Errors in Clinical Trial Narratives: A Multi-Modal Feature Engineering Approach with LightGBM
arXiv:2604.19759v1 Announce Type: new Abstract: Clinical trials require strict adherence to medication protocols, yet dosing errors remain a persistent challenge affecting patient safety and trial integrity. We present an automated system for detecting dosing errors in unstructured clinical trial narratives using gradient boosting with comprehensive multi-modal feature engineering. Our approach combines 3,451 features spanning traditional […]
Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks
arXiv:2604.19755v1 Announce Type: new Abstract: Anti-money laundering (AML) transaction monitoring generates large volumes of alerts that must be rapidly triaged by investigators under strict audit and governance constraints. While large language models (LLMs) can summarize heterogeneous evidence and draft rationales, unconstrained generation is risky in regulated workflows due to hallucinations, weak provenance, and explanations that […]