Performance of large language models in delivering accurate and comprehensible patient information on heart failure and cardiomyopathy

Background: Large language models (LLMs) are increasingly used by patients seeking cardiovascular health information through digital platforms. However, their accuracy and suitability for providing guidance on heterogeneous diseases such as cardiomyopathies and heart failure remain inadequately evaluated. This study systematically benchmarked state-of-the-art LLMs on patient-oriented heart failure and cardiomyopathy queries with respect to clinical appropriateness and comprehensibility.

Methods: Six prominent LLM chatbots were tested on 50 expert-curated questions covering disease understanding and lifestyle advice. A web-based evaluation platform randomized and blinded responses for assessment by twelve reviewers (cardiologists, medical students, and AI auto-graders). Responses were rated on a 1–5 Likert scale across nine domains, including appropriateness, readability, and empathy. Reviewers also chose their preferred model for each question.

Results: Linguistic complexity and output length varied substantially. Gemini provided the most readable responses (Flesch–Kincaid Grade 11.3 ± 1.9) but was among the most verbose (668.7 ± 116.1 words). Across 2,700 ratings, Gemini received the highest composite mean rating (4.41 ± 0.77), excelling in completeness and factual reliability, followed by Grok (4.23 ± 0.76). Confabulation avoidance scored consistently high across all models (4.49 ± 0.02), while conciseness scored lowest (3.81 ± 0.05). Consistent with these ratings, evaluators selected Gemini as their preferred information source in 43.7% of cases, followed by Grok (30.3%). Rating tendencies varied by evaluator group: auto-graders gave the highest average scores (mean 4.58 ± 0.60), followed by students (4.10 ± 0.88), while experts were more conservative (3.79 ± 0.93).

Discussion: All LLMs showed good accuracy in avoiding medical misinformation, although readability and comprehensiveness varied between models. Major factual errors or hallucinations were rare in our blinded evaluation, but they were not entirely absent.
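For readers unfamiliar with the readability metric reported in the results, the Flesch–Kincaid Grade Level is conventionally computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59, so a lower value indicates text that is easier to read. The abstract does not say which tool produced the reported scores; the following is only a minimal sketch of the standard formula. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the vowel-group syllable counter is a rough heuristic assumed here for illustration, so its results may differ from those of established readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level using the standard coefficients."""
    # Sentences are split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    sample = "Heart failure means the heart cannot pump enough blood."
    print(round(flesch_kincaid_grade(sample), 2))
```

Because the formula depends only on sentence length and syllables per word, a long response can still score as relatively readable, which is in line with the abstract reporting Gemini as both the most readable and among the most verbose.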

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.