Background: Health research using predictive and/or generative AI is growing rapidly. Just as in traditional clinical studies, the way AI studies are conducted can introduce systematic errors. Translating this AI evidence into clinical practice and research requires critical appraisal tools for clinical decision makers and researchers. Objective: To identify existing tools for the critical appraisal of clinical studies that use artificial intelligence (AI) and to examine the concepts and domains these tools explore. Methods: Inclusion criteria followed the PCC framework. Population: clinical studies of artificial intelligence. Concept: tools for critical appraisal and associated constructs such as quality, reporting, validity, risk of bias, and applicability. Context: clinical practice. In addition, bias classification studies and chatbot assessment studies were included. We searched medical and engineering databases (MEDLINE, EMBASE, CINAHL, PsycINFO, and IEEE). We included primary clinical research presenting tools for critical appraisal. Classic and systematic reviews were included in the first screening phase and excluded in the second phase, after new tools had been identified by forward snowballing. We excluded non-human, computational, and mathematical research, as well as letters, opinion papers, and editorials. We used Rayyan for screening. Data extraction was performed by two observers, and discrepancies were resolved by discussion. The protocol was previously registered in OSF (https://doi.org/10.17605/OSF.IO/ETYDS). We adhered to the PRISMA extension for scoping reviews (PRISMA-ScR) and to the PRISMA-S extension for reporting literature searches in systematic reviews. Results: We retrieved 4393 unique records for screening. After excluding 3803 records, 119 were selected for full-text screening; of these, 59 were excluded. After including 10 studies identified through other methods, a total of 70 records were finally included.
Forty-six of them were tools: reporting guidelines predominated, followed by 15 tools for critical appraisal, 2 for study quality, and 2 for risk of bias. Nine papers focused on bias classification or mitigation. We found 15 chatbot assessment studies or systematic reviews of chatbot studies (6 and 9, respectively), which form a very heterogeneous group. Conclusions: The results depict a landscape of evidence tools in which reporting tools predominate, followed by critical appraisal tools, with few tools addressing risk of bias. The mismatch between bias concepts in AI and in epidemiology should be considered in critical appraisal, especially regarding fairness and bias mitigation in AI. Finally, chatbot assessment is a vast and evolving field in which progress in study design, reporting, and critical appraisal is necessary and urgent. Clinical Trial: https://doi.org/10.17605/OSF.IO/ETYDS
Randomized controlled trial to evaluate an app-based multimodal digital intervention for people with type 2 diabetes in comparison to a placebo app
Introduction: This multi-center, parallel-group randomized controlled trial evaluated the app-based intervention mebix, developed by Vision2b GmbH in Germany, for people with type 2 diabetes in comparison to a placebo app.



