Compliance-Scored Best-of-N Guardrail Orchestration for Multimodal Document Generation in Payments Dispute Defense

arXiv:2606.01513v1 Announce Type: cross Abstract: High-stakes enterprise document generation, including financial dispute narratives, compliance notices, and audit summaries, demands schema correctness, policy compliance, and low-latency operation at scale. Prior to a unified guardrail layer, production systems often stitched together separate PII redaction, content moderation, and format validation steps, leading to fragmented logic, slower request paths, […]

Attested Tool-Server Admission: A Security Extension to the Model Context Protocol

arXiv:2605.24248v2 Announce Type: replace-cross Abstract: The Model Context Protocol (MCP) standardizes how a large-language-model (LLM) agent and an external tool server exchange messages, but not trust: a host reads a server’s self-declared tool list and dispatches calls, with no notion of which servers it may use, at what sensitivity, or which of a server’s tools […]

Beyond String Matching: Semantic Evaluation of PDF Table Extraction

arXiv:2603.18652v2 Announce Type: replace-cross Abstract: Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing evaluation approaches rely on rule-based metrics that fail to capture semantic equivalence of table content. We present a benchmarking framework based on synthetically generated PDFs with precise LaTeX ground truth, using tables […]

Fair Finetuning Mitigates Distribution Inference Attacks

arXiv:2606.01719v1 Announce Type: cross Abstract: Machine learning models trained on sensitive data can inadvertently leak population-level information about their training distributions, a threat known as a distribution inference attack (DIA). An adversary with black-box access can infer sensitive demographic properties, such as subgroup proportions, without observing any training data directly. While defenses such as differential […]

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

arXiv:2605.20301v2 Announce Type: replace-cross Abstract: In autonomous driving, 3D object detection is essential for accurate perception and reliable decision-making. However, object motion and ego-motion often induce cross-frame spatiotemporal inconsistencies in BEV-based detectors, leading to temporal BEV feature misalignment and degraded spatiotemporal consistency. To address these challenges, we propose Co-Fusion4D, a unified framework that explicitly preserves […]

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?

arXiv:2606.01993v1 Announce Type: cross Abstract: Abundant procedural knowledge on the Web holds great potential for helping agents solve long-horizon tasks. However, such knowledge is often multimodal, heterogeneous, noisy, and implicitly assumes human executors, making it difficult to use directly as the skills required by agents. To bridge the gap between human-oriented guides and agent-executable skills, […]

Domain-Shift-Aware Conformal Prediction for Large Language Models

arXiv:2510.05566v2 Announce Type: replace-cross Abstract: Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real-world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but standard conformal prediction breaks down under domain shift, often leading to under-coverage and unreliable […]
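For readers unfamiliar with the baseline this abstract refers to, here is a minimal sketch of standard split conformal prediction, which provides the finite-sample coverage guarantee when calibration and test data are exchangeable. It is not the paper's domain-shift-aware method, which the truncated abstract does not describe. The function names `conformal_threshold` and `prediction_set`, and the convention that a lower nonconformity score means a more plausible answer, are assumptions made for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the split-conformal score threshold for miscoverage level alpha.

    cal_scores: nonconformity scores on a held-out calibration set
                (assumption: lower score = more plausible answer).
    """
    n = len(cal_scores)
    # Rank of the finite-sample-corrected quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must include everything.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return [label for label, score in candidate_scores.items() if score <= threshold]


# Hypothetical calibration scores and candidate answers, for illustration only.
cal = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
q = conformal_threshold(cal, alpha=0.5)
kept = prediction_set({"a": 0.2, "b": 0.5, "c": 0.9}, q)
```

The guarantee relies on exchangeability between calibration and test scores, which is the assumption that domain shift violates.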

Quantitative Movement Testing: Measuring Patient Movements from a Single Smartphone Video

arXiv:2606.02301v1 Announce Type: cross Abstract: Chronic pain diminishes quality of life by decreasing functional ability, yet objectively measuring this functional impact remains challenging in real-world settings. While optical motion capture provides high precision for assessing altered movement quality, it is costly and restricted to laboratory environments. We aimed to develop and validate Quantitative Movement Testing […]

When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs

arXiv:2602.03554v2 Announce Type: replace-cross Abstract: Recent progress has expanded the use of large language models (LLMs) in drug discovery, including synthesis planning. However, objective evaluation of retrosynthesis performance remains limited. Existing benchmarks and metrics typically rely on published synthetic procedures and Top-K accuracy based on a single ground truth, which does not capture the open-ended nature of […]

Explainable AI Through a Democratic Lens: DhondtXAI for D’Hondt-Projected Feature Attribution

arXiv:2411.05196v3 Announce Type: replace Abstract: This study presents DhondtXAI as a SHAP-independent, D’Hondt-based attribution framework for tabular XAI. Instead of model-native feature importance or SHAP values, DhondtXAI computes background-interventional removal effects, separates positive and negative evidence, forms optional feature alliances, applies optional thresholds, allocates seats via the D’Hondt rule, and projects onto the local model-output […]
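The seat-allocation step named in this abstract uses the D'Hondt rule, which can be sketched as follows. This shows only the generic highest-averages rule; treating per-feature evidence magnitudes as "votes" is an assumption made for illustration, and the function name `dhondt_allocate` and the example numbers are hypothetical rather than taken from DhondtXAI.

```python
def dhondt_allocate(votes, seats):
    """Allocate `seats` among the keys of `votes` with the D'Hondt rule.

    votes: mapping from name to a non-negative vote total (here, assumed
           to be a per-feature evidence magnitude).
    Each seat goes to the name with the highest quotient
    votes / (seats_already_won + 1). Ties go to the earlier key.
    """
    if seats < 0:
        raise ValueError("seats must be non-negative")
    allocation = {name: 0 for name in votes}
    if not votes:
        return allocation
    for _ in range(seats):
        winner = max(votes, key=lambda name: votes[name] / (allocation[name] + 1))
        allocation[winner] += 1
    return allocation


# Hypothetical evidence totals for four features, eight seats to distribute.
result = dhondt_allocate({"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}, 8)
# result == {"A": 4, "B": 3, "C": 1, "D": 0}
```

Because each seat won lowers the winner's next quotient, large contributors receive most seats while smaller ones can still win a seat once the leaders' quotients fall below theirs.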

MulFeRL: Enhancing Reinforcement Learning with Verbal Feedback in a Multi-turn Loop

arXiv:2601.22900v2 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) is widely used to improve reasoning across domains, but outcome-only scalar rewards are often sparse and uninformative. This limitation is especially severe for failed samples, where scalar rewards indicate only that a solution is incorrect without explaining why the reasoning breaks down. In this […]

PolarMem: A Training-Free Polarized Latent Graph Memory for Verifiable Vision-Language Models

arXiv:2602.00415v2 Announce Type: replace Abstract: Memory is not merely a storage mechanism for intelligent systems, but a structure for organizing evidence and constraining belief. This is especially important for multimodal reasoning, where retrieved evidence must be both query-relevant and visually consistent. However, current memory systems for vision-language models (VLMs) remain largely positive-associative: they retrieve what […]


Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844