Automated Identification of Nursing Diagnoses and Interventions From Nursing Records Using a Retrieval-Augmented Large Language Model Approach: Quantitative Study

Background: Electronic health records (EHRs) have been widely adopted, but most nursing records remain in unstructured free-text format, which limits the secondary use of nursing data. Standardized terminologies improve semantic interoperability; however, manual annotation is labor intensive and yields inconsistent results. Advances in large language models (LLMs) and retrieval-augmented generation (RAG) have created new possibilities for automating the mapping of nursing records to standardized terminologies, thereby enhancing the utility of nursing data.

Objective: This study aimed to develop and evaluate Clinical Care Classification nursing terminology with retrieval-augmented mapping (CNTRAM), a 2-stage RAG framework incorporating an LLM, for the automated mapping of nursing diagnoses and interventions from free-text intensive care unit nursing records to standardized Clinical Care Classification (CCC) terms.

Methods: CNTRAM is a 2-stage retrieval-augmented framework that integrates dense embedding retrieval, retrieval-enhanced prompting, and few-shot LLM guidance to map free-text nursing records to standardized CCC terminology. Free-text records and their segments were embedded as subqueries to retrieve the most relevant CCC reference entries and annotated examples, which were merged to construct context windows. Each subquery was combined with its retrieved context using a predefined RAG prompt template that enforces CCC coding rules and a structured JSON schema and was then processed by an LLM to generate CCC outputs. A gold standard dataset of 100 intensive care unit nursing records was annotated by 3 senior nurses and finalized via consensus, with interrater reliability quantified using the Fleiss κ.
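The Fleiss κ mentioned above can be illustrated with a minimal Python sketch. It assumes that every item receives exactly one categorical label from each rater and that all items have the same number of raters; how the study converted multi-label CCC annotations into ratable items is not stated here, so the function name `fleiss_kappa`, the input layout, and the toy labels are illustrative assumptions only.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per rater).

    Assumption: every item is rated by the same number of raters (>= 2),
    and each rater assigns exactly one label per item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = set().union(*counts)

    # Observed agreement: fraction of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    total = n_items * n_raters
    shares = [sum(cnt[cat] for cnt in counts) / total for cat in categories]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:  # only one category was ever used
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data with hypothetical labels "A" and "B": 4 items, 3 raters each.
toy = [["A", "A", "A"], ["A", "A", "B"], ["B", "B", "B"], ["A", "B", "B"]]
kappa = fleiss_kappa(toy)
```

On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are read as substantial agreement.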
Model performance was compared with traditional baselines (term frequency–inverse document frequency, Bidirectional Encoder Representations from Transformers [BERT], and a fine-tuned BERT model) and 4 LLMs (Mistral-7B, Qwen3-14B, Llama3.3-70B, and DeepSeek-R1) across no-RAG, zero-shot, and few-shot settings, using precision, recall, F1-score, and intersection over union (IoU) as metrics.

Results: Interrater agreement was substantial, with Fleiss κ=0.6449 for diagnoses and κ=0.6180 for interventions. CNTRAM achieved substantial performance gains over all baseline approaches. For nursing diagnoses, DeepSeek-R1 with RAG+few-shot prompting achieved the best performance, with a precision of 0.7909, a recall of 0.7901, an F1-score of 0.7836, and an IoU of 0.7614. These results were significantly higher than those of traditional baselines (F1-score 0.0268–0.2027), no-RAG LLMs (F1-score 0.0299–0.0588), and RAG+zero-shot LLMs (F1-score 0.0716–0.2160). For nursing interventions, the same configuration achieved a precision of 0.8453, a recall of 0.8504, an F1-score of 0.8413, and an IoU of 0.8097, outperforming traditional baselines (F1-score 0.1200–0.2323), no-RAG LLMs (F1-score 0.0077–0.0189), and RAG+zero-shot LLMs (F1-score 0.2744–0.4461).

Conclusions: This study developed CNTRAM, an LLM-based 2-stage RAG framework that combines dense embedding retrieval and few-shot prompting for CCC terminology mapping. Using DeepSeek-R1, CNTRAM outperformed baseline models, improved mapping accuracy, and provided a feasible solution for standardizing unstructured nursing data.
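As a rough illustration of the four evaluation metrics named in the abstract, the sketch below scores one record by comparing the set of predicted codes with the set of gold standard codes, then averages the scores over records. The abstract does not say how the study aggregated scores, so the per-record averaging, the handling of empty sets, the function names `set_metrics` and `average_metrics`, and the placeholder code labels are all assumptions.

```python
def set_metrics(predicted, gold):
    """Precision, recall, F1, and IoU for one record's predicted vs gold codes."""
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)

    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    union = len(pred | ref)
    # Convention chosen for this sketch: two empty sets score 0.0.
    iou = overlap / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


def average_metrics(records):
    """Mean of each metric over (predicted, gold) pairs, one pair per record."""
    scores = [set_metrics(pred, gold) for pred, gold in records]
    return {key: sum(s[key] for s in scores) / len(scores) for key in scores[0]}


# Placeholder labels, not real CCC codes.
one = set_metrics(["CODE_A", "CODE_B", "CODE_C"], ["CODE_B", "CODE_C", "CODE_D"])
avg = average_metrics([
    (["CODE_A", "CODE_B", "CODE_C"], ["CODE_B", "CODE_C", "CODE_D"]),
    (["CODE_X"], ["CODE_X"]),
])
```

Averaging per record means the mean F1-score is not necessarily the harmonic mean of the mean precision and mean recall, which is one possible reason a reported F1-score can sit slightly below both.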

