Keywords: Artificial intelligence; ChatGPT; Global quality score; Information sources; Urooncology

MeSH: Male; Humans; Artificial Intelligence; Urologic Neoplasms; Testicular Neoplasms; Urology; Neoplasms, Germ Cell and Embryonal

Source: DOI:10.1016/j.clgc.2023.12.017

Abstract:
Background: ChatGPT, an artificial intelligence language model created by OpenAI, has gained considerable recognition for its capacity to produce text responses resembling human language. This study evaluates the quality of ChatGPT's responses to publicly accessible queries related to prostate, kidney, bladder, and testicular cancers.
Methods: Frequently asked questions (FAQs) pertaining to prostate, bladder, kidney, and testicular cancers were compiled from diverse sources. Additionally, the recommendations outlined in the European Association of Urology (EAU) 2023 oncology guidelines were consulted. The questions chosen for evaluation were presented to the premium version of ChatGPT 4.0. The quality of ChatGPT's responses was appraised using the global quality score (GQS). Each response was independently reviewed by a panel of physicians, who assigned a GQS to assess its overall quality.
Results: For prostate cancer, 64.6% of the questions had a GQS of 5, compared to 62.9% for bladder, 68.1% for kidney, and 63.9% for testicular cancers; none of the responses had a GQS of 1. For each disease, the category with the lowest proportion of responses scoring a GQS of 5 was prognosis and follow-up. The mean GQS of the answers given to EAU guideline questions was statistically significantly lower than the mean GQS of the answers given to FAQs.
Conclusion: ChatGPT is a valuable tool for addressing general inquiries regarding urological cancers, with commendable accuracy. Nonetheless, its performance in responding to questions aligned with the EAU guideline was deemed unsatisfactory.