Introduction
Diabetes management increasingly relies on telehealth platforms in which patients generate both structured and unstructured data. The unstructured data, in the form of free-text notes, often contain information beyond what the structured data capture. Extracting this information can enrich patient profiles and help optimize treatment. In particular, extracting physical activity information from these notes is considered important. This study evaluates rule-based/regex algorithms and a locally deployed Mistral LLM for physical activity information extraction and data augmentation, and benchmarks their performance against a state-of-the-art model, GPT-4.1.

Methods
Data from 943 patients collected over 12 years in the DiabMemory system, supplemented by 100 synthetic notes, were analyzed. Patient privacy was preserved by applying a free-text pseudonymization algorithm to all notes and by using locally deployed approaches, thereby avoiding third-party cloud services. Three tasks were conducted: (1) extraction of physical activity (PA) data from free-text notes using regex and a locally deployed Mistral LLM, (2) integration of the extracted data with structured activity records using a rule-based approach and the local Mistral LLM, and (3) benchmarking of the local approaches against GPT-4.1 on the synthetic notes.

Results
Both local methods performed strongly in task 1, with F1-scores of at least 0.84. In task 2, rule-based augmentation (F1 = 0.73) surpassed the Mistral LLM (F1 = 0.37). In task 3, GPT-4.1 outperformed the local LLM but did not consistently surpass regex. The rule-based algorithms also required substantially less computation time than either LLM.

Discussion
The regex algorithm achieved superior accuracy and efficiency but required extensive dataset-specific development. Prompt engineering for the LLM, by contrast, required less knowledge and less development time. The findings of this work generally align with prior studies but are limited by the rather small test set and the use of synthetic data.

Conclusions
Local NLP approaches can enhance structured PA data in diabetes telehealth. Rule-based algorithms remain a strong option where computational resources are limited, though future work should validate these findings on larger and more diverse datasets.
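To make the regex-based extraction in task 1 and the F1 evaluation more concrete, the following is a minimal sketch, not the study's actual algorithm. The pattern, the English activity keywords, the example notes, and the function names `extract_pa` and `f1_score` are all assumptions chosen for illustration. The real rules were developed specifically for the DiabMemory notes and are not described in the abstract.

```python
import re

# Hypothetical pattern (an assumption, not the study's rules): an activity
# keyword, up to 20 non-digit characters, then a numeric amount and a time unit.
PA_PATTERN = re.compile(
    r"(?P<activity>walking|cycling|swimming|jogging|hiking)"
    r"\D{0,20}?"
    r"(?P<amount>\d+(?:[.,]\d+)?)\s*"
    r"(?P<unit>min(?:utes)?|h(?:ours?)?)\b",
    re.IGNORECASE,
)


def extract_pa(note):
    """Return a list of (activity, duration_in_minutes) tuples found in a note."""
    records = []
    for match in PA_PATTERN.finditer(note):
        # Accept both decimal comma and decimal point.
        amount = float(match.group("amount").replace(",", "."))
        is_hours = match.group("unit").lower().startswith("h")
        minutes = amount * 60 if is_hours else amount
        records.append((match.group("activity").lower(), minutes))
    return records


def f1_score(predicted, gold):
    """F1 over exact-match records: harmonic mean of precision and recall."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, `extract_pa("Went cycling for 45 min after lunch")` yields one record for cycling with a duration of 45 minutes. A sketch like this also shows why such rules tend to be fast but dataset-specific: every keyword, unit spelling, and word order has to be anticipated by hand.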
Based on dual perspectives of management and ethics: exploring challenges and governance approaches for new media applications in psychiatric specialty hospitals
The further promotion and application of new media technologies present new opportunities for psychiatric specialty hospitals in areas such as health education, doctor-patient communication, service