Background: The clinical value of artificial intelligence (AI)–based diagnostic systems depends not only on their accuracy but also on how well their outputs integrate with clinicians’ judgments in practice. Critical knowledge gaps remain regarding diagnostic concordance between AI and clinicians in stress echocardiography interpretation, patient characteristics predicting discordance, and how cardiologists respond when AI recommendations conflict with their clinical diagnoses.
Objective: This study examined the diagnostic alignment between an AI-driven stress echocardiography system (EchoGo Pro [EGP]) and cardiologists’ diagnoses of coronary artery disease (CAD), identified predictors of concordance and AI scan rejection, and explored cardiologists’ decision-making strategies when disagreements arise.
Methods: We conducted a mixed methods study. The quantitative study analyzed concordance between EGP and cardiologists using data from 854 participants with suspected CAD in the multicenter PROTEUS randomized controlled trial. Logistic regression identified predictors of agreement, disagreement, and scan rejection, adjusting for age, sex, smoking status, BMI, and cardiovascular risk factors (hypertension, hypercholesterolemia, diabetes, family history of CAD, and prior CAD events). To gain deeper insight into discordance, we conducted a qualitative study analyzing survey responses from 61 UK consultant cardiologists recruited via Qualtrics, exploring their perceptions of AI tools, the risks of following discordant AI recommendations, and their typical responses to AI-clinician disagreement.
Results: EGP and cardiologists agreed in 60% (512/854) of the cases, but agreement was significantly lower among patients with hypertension (OR 0.58, 95% CI 0.38‐0.89; P=.01), diabetes (OR 0.56, 95% CI 0.35‐0.90; P=.02), and pre-existing CAD (OR 0.48, 95% CI 0.30‐0.77; P=.002). EGP rejected 26.1% (222/854) of the scans due to insufficient image quality, with rejection significantly more common in male patients (=0.35; P=.03) and those with a family history of CAD. If a positive CAD diagnosis was assigned when either cardiologists or EGP identified CAD, the proportion of positive cases increased from 17.9% (153/854) to 22.1% (189/854), potentially identifying additional at-risk patients. Survey respondents (50/60, 85% male; 26/57, 46% aged 40-49 years; 39/61, 64% White) required 65% to 69% confidence in their initial diagnosis to justify disregarding contradictory AI recommendations. The survey findings revealed that cardiologists treated AI recommendations as advisory rather than definitive. When facing discordance, they retained confidence in their judgment and sought corroboration through additional testing, data review, or second opinions rather than deferring to AI. Paradoxically, cardiologists with higher confidence in AI tools required greater confidence in their own diagnosis to disregard AI recommendations (=7.73; P=.02). Cardiologists attributed discordance primarily to AI’s inability to incorporate patient history, comorbidities, and broader clinical context.
Conclusions: EGP shows promise as an adjunctive tool but struggles with multimorbid patients and exhibits high, uneven rejection rates. Cardiologists use AI to prompt scrutiny, not replace judgment. Future systems need to integrate wider patient data with imaging and minimize bias through representative training to avoid exacerbating inequities.
Trial Registration: ClinicalTrials.gov NCT05028179; https://clinicaltrials.gov/study/NCT05028179
International Registered Report Identifier (IRRID): RR2-10.1136/bmjopen-2023-079617
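The "either reader positive" rule mentioned in the Results can be illustrated with a small Python sketch. This is an assumption-laden illustration rather than the study's analysis code: the function names `either_positive` and `percent` and the five toy labels are hypothetical, and only the counts 153, 189, and 854 come from the abstract.

```python
from typing import List, Sequence


def either_positive(cardiologist: Sequence[bool], ai: Sequence[bool]) -> List[bool]:
    """Label a case positive when at least one reader flags CAD (logical OR)."""
    if len(cardiologist) != len(ai):
        raise ValueError("both readers must rate the same cases")
    return [c or a for c, a in zip(cardiologist, ai)]


def percent(count: int, total: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Toy, made-up labels for five cases (hypothetical, not trial data).
toy_cardiologist = [True, False, False, True, False]
toy_ai = [True, True, False, False, False]
toy_combined = either_positive(toy_cardiologist, toy_ai)

# Counts as reported in the abstract.
TOTAL = 854
POSITIVE_BASELINE = 153  # positives before applying the either-positive rule
POSITIVE_EITHER = 189    # positives when either reader identifies CAD

additional = POSITIVE_EITHER - POSITIVE_BASELINE
print(toy_combined)
print(percent(POSITIVE_BASELINE, TOTAL), percent(POSITIVE_EITHER, TOTAL), additional)
```

Under these counts, the rule reclassifies 36 further participants as positive, which corresponds to the reported rise from 17.9% to 22.1%.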
Explainable AI in kidney stone detection and segmentation: a mini review
Kidney stones are one of the most common renal disorders that can produce severe complications if not diagnosed and treated early. Recently, advances in AI