Large language models for automated PRISMA 2020 adherence checking

arXiv:2511.16707v1 Announce Type: cross Abstract: Evaluating adherence to the PRISMA 2020 guideline remains a burden in the peer review process. To address the lack of shareable benchmarks, we constructed a copyright-aware benchmark of 108 Creative Commons-licensed systematic reviews and evaluated ten large language models (LLMs) across five input formats. In a development cohort, supplying structured PRISMA […]

AutoBackdoor: Automating Backdoor Attacks via LLM Agents

arXiv:2511.16709v1 Announce Type: cross Abstract: Backdoor attacks pose a serious threat to the secure deployment of large language models (LLMs), enabling adversaries to implant hidden behaviors triggered by specific inputs. However, existing methods often rely on manually crafted triggers and static data pipelines, which are rigid, labor-intensive, and inadequate for systematically evaluating modern defense robustness. […]

Concept-Based Interpretability for Toxicity Detection

arXiv:2511.16689v1 Announce Type: cross Abstract: The rise of social networks has not only facilitated communication but also allowed the spread of harmful content. Although significant advances have been made in detecting toxic language in textual data, the exploration of concept-based explanations in toxicity detection remains limited. In this study, we leverage various subtype attributes present […]

RAG-Driven Data Quality Governance for Enterprise ERP Systems

arXiv:2511.16700v1 Announce Type: cross Abstract: Enterprise ERP systems managing hundreds of thousands of employee records face critical data quality challenges when human resources departments perform decentralized manual entry across multiple languages. We present an end-to-end pipeline combining automated data cleaning with LLM-driven SQL query generation, deployed on a production system managing 240,000 employee records over […]

Password Strength Analysis Through Social Network Data Exposure: A Combined Approach Relying on Data Reconstruction and Generative Models

arXiv:2511.16716v1 Announce Type: cross Abstract: Although passwords remain the primary defense against unauthorized access, users tend to choose passwords that are easy to remember. This behavior significantly increases security risks, also because traditional password strength evaluation methods are often inadequate. In this discussion paper, we present SODA ADVANCE, a data […]

Patient-level Information Extraction by Consistent Integration of Textual and Tabular Evidence with Bayesian Networks

arXiv:2511.17056v1 Announce Type: new Abstract: Electronic health records (EHRs) form an invaluable resource for training clinical decision support systems. To leverage the potential of such systems in high-risk applications, we need large, structured tabular datasets on which we can build transparent feature-based models. While part of the EHR already contains structured information (e.g. diagnosis codes, […]

SAM 3: Segment Anything with Concepts

arXiv:2511.16719v1 Announce Type: cross Abstract: We present Segment Anything Model (SAM) 3, a unified model that detects, segments, and tracks objects in images and videos based on concept prompts, which we define as either short noun phrases (e.g., “yellow school bus”), image exemplars, or a combination of both. Promptable Concept Segmentation (PCS) takes such prompts […]

Stochastic neutral fractions and the effective population size

arXiv:2502.05062v2 Announce Type: replace-cross Abstract: The dynamics of a general structured population is modelled using a general stochastic differential equation (SDE) with an infinite decomposability property. This property allows the population to be divided into an arbitrary number of allelic components, also known as stochastic neutral fractions. When demographic noise is small, a fast-slow principle […]

The Belief-Desire-Intention Ontology for modelling mental reality and agency

arXiv:2511.17162v1 Announce Type: new Abstract: The Belief-Desire-Intention (BDI) model is a cornerstone for representing rational agency in artificial intelligence and cognitive sciences. Yet, its integration into structured, semantically interoperable knowledge representations remains limited. This paper presents a formal BDI Ontology, conceived as a modular Ontology Design Pattern (ODP) that captures the cognitive architecture of agents […]

Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

arXiv:2511.16786v1 Announce Type: cross Abstract: Multimodal large language models suffer from substantial inference overhead since multimodal KV Cache grows proportionally with the visual input length. Existing multimodal KV Cache compression methods mostly rely on attention score to reduce cache size, which makes them incompatible with established efficient attention kernels (e.g., FlashAttention) and ignores the […]

Fairness Evaluation of Large Language Models in Academic Library Reference Services

arXiv:2507.04224v3 Announce Type: replace-cross Abstract: As libraries explore large language models (LLMs) for use in virtual reference services, a key question arises: Can LLMs serve all users equitably, regardless of demographics or social status? While they offer great potential for scalable support, LLMs may also reproduce societal biases embedded in their training data, risking the […]

Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

arXiv:2511.13646v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are typically equipped with a suite of coding tools and can autonomously decide the next actions to form complete […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844