%0 Journal Article %T Comparative analysis of artificial intelligence chatbot recommendations for urolithiasis management: A study of EAU guideline compliance. %A Altıntaş E %A Ozkent MS %A Gül M %A Batur AF %A Kaynar M %A Kılıç Ö %A Göktaş S %J Fr J Urol %V 34 %N 7 %D 2024 Jul 5 %M 38849035 %R 10.1016/j.fjurol.2024.102666 %X OBJECTIVE: Artificial intelligence (AI) applications are increasingly being utilized by both patients and physicians for accessing medical information. This study focused on the urolithiasis section (pertaining to kidney and ureteral stones) of the European Association of Urology (EAU) guideline, a key reference for urologists.
METHODS: We directed inquiries to four distinct AI chatbots to assess the adherence of their responses to the guideline. A total of 115 recommendations were transformed into questions, and responses were evaluated by two urologists, each with a minimum of 5 years of experience, using a 5-point Likert scale (1 - False, 2 - Inadequate, 3 - Sufficient, 4 - Correct, and 5 - Very correct).
RESULTS: The mean scores for Perplexity and ChatGPT 4.0 were 4.68 (SD: 0.80) and 4.80 (SD: 0.47), respectively, both of which differed significantly from the scores of Bing and Bard (Bing vs. Perplexity, P<0.001; Bard vs. Perplexity, P<0.001; Bing vs. ChatGPT, P<0.001; Bard vs. ChatGPT, P<0.001). Bing had a mean score of 4.21 (SD: 0.96), while Bard scored 3.56 (SD: 1.14), a significant difference (Bing vs. Bard, P<0.001). Bard exhibited the lowest score among all chatbots. Analysis of references revealed that Perplexity and Bing cited the guideline most frequently (47.3% and 30%, respectively).
CONCLUSIONS: Our findings demonstrate that ChatGPT 4.0 and, notably, Perplexity align well with EAU guideline recommendations. These continuously evolving applications may play a crucial role in delivering information to physicians in the future, particularly regarding urolithiasis.