Development and interpretable machine learning models for classification of pancreatic pseudocyst risk in acute pancreatitis

Introduction: Pancreatic pseudocysts (PPC) are a late local complication of acute pancreatitis (AP). Persistent PPC carry a high risk of severe outcomes. Existing models, which are

Implementing AI innovation in radiology departments in the English NHS: a qualitative study on the experiences of professionals, patient groups and innovators

Introduction: Digital solutions and Artificial Intelligence (AI) innovations are often presented as the answer to many challenges faced by healthcare systems around the world. The UK

Development and Evaluation of a Hallucination Awareness Scale for Healthcare Professionals and its impact on diagnostic confidence

Generative artificial intelligence (Gen AI) has gained immense significance in recent years, particularly in the field of healthcare. Despite its significant role in streamlining healthcare-related

Planning and delivering co-creation workshops: practical lessons from digital health device design

Co-creation methods are increasingly recognised as essential in digital health and care, yet engineers and physical scientists new to the field often find the literature

Promises and challenges of applying large language models in the healthcare domain

Large language models are rapidly moving from theoretical concepts to active clinical pilots. Current approaches diverge between general-purpose models, which adapt to healthcare via prompt

Ontology- and LLM-based data harmonization for federated learning in healthcare

March 18, 2026

Introduction: Semantic heterogeneity across electronic health records (EHRs) limits scalable and privacy-preserving analytics in healthcare. While federated learning (FL) enables collaborative modeling without sharing raw data, it requires consistent, ontology-aligned representations. We present an ontology- and large language model (LLM)-based data harmonization approach to support secure, interoperable FL workflows.

Methods: We propose a general two-step pipeline for converting or annotating clinical text into a predefined target ontology format. First, candidate concepts are retrieved from the target vocabulary using embedding-based similarity search or ontology cross-references. Second, an LLM acts as a semantic validator, accepting or rejecting candidates based on explicit equivalence or subsumption criteria. The approach is ontology-agnostic and configurable; mapping to MONDO and HPO is demonstrated as a real-world use case. Final accepted mappings were evaluated against independent human expert assessment.

Results: Across two clinical datasets, expert-LLM agreement reached up to 92%, with overall performance ranging from 78% to 91% depending on candidate-generation strategy. Retrieval alone was insufficient for reliable mapping, whereas LLM-based validation substantially improved precision while complementary retrieval strategies improved recall.

Discussion: The proposed pipeline transforms ontology-based harmonization from a manual expert task into a reusable and configurable workflow suitable for federated healthcare research. By combining high-recall retrieval with LLM-based semantic adjudication, the approach enables scalable, privacy-preserving conversion of heterogeneous clinical text into standardized representations across domains.
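
The two-step pipeline described in the abstract (retrieve candidates, then validate them) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the vocabulary and its `EX:` identifiers are invented placeholders (not real MONDO or HPO terms), `embed` is a toy bag-of-words stand-in for a real embedding model, and `stub_validator` is a simple rule standing in for the LLM call.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary. The IDs are placeholders, NOT real MONDO/HPO identifiers.
TARGET_VOCAB = {
    "EX:0001": "acute pancreatitis",
    "EX:0002": "chronic pancreatitis",
    "EX:0003": "pancreatic pseudocyst",
    "EX:0004": "type 2 diabetes mellitus",
}


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_candidates(mention, vocab, top_k=3):
    """Step 1: rank vocabulary concepts by similarity to the clinical mention."""
    query = embed(mention)
    scored = [(cid, cosine(query, embed(label))) for cid, label in vocab.items()]
    scored = [(cid, score) for cid, score in scored if score > 0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def stub_validator(mention, label):
    """Step 2 placeholder for the LLM validator (assumption).

    Accepts a candidate only if every token of the concept label occurs in
    the mention, a crude proxy for an equivalence-or-subsumption check.
    """
    return set(label.lower().split()) <= set(mention.lower().split())


def harmonize(mention, vocab, validator=stub_validator, top_k=3):
    """Run retrieval, then keep only the candidates the validator accepts."""
    candidates = retrieve_candidates(mention, vocab, top_k)
    return [cid for cid, _ in candidates if validator(mention, vocab[cid])]


if __name__ == "__main__":
    print(harmonize("severe acute pancreatitis", TARGET_VOCAB))
```

In this toy setup, retrieval alone returns both pancreatitis concepts for the mention "severe acute pancreatitis", and the validation step rejects the chronic variant. In a real system the `validator` argument would wrap an LLM prompt that states the acceptance criteria explicitly.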

