Artificial intelligence chatbot

  • Article type: Journal Article
    OBJECTIVE: The purpose of this study was to assess how well ChatGPT, an AI-powered chatbot, performed in helping to manage pediatric sialadenitis and identify when sialendoscopy was necessary.
    METHODS: Forty-nine clinical cases of pediatric sialadenitis were retrospectively reviewed. ChatGPT was given patient data, and it offered differential diagnoses, proposed further tests, and suggested treatments. The decisions made by the treating otolaryngologists were contrasted with the answers provided by ChatGPT. ChatGPT response consistency and interrater reliability were analyzed.
    RESULTS: ChatGPT showed 78.57% accuracy in primary diagnosis, and the diagnosis was considered likely in a further 17.35% of cases. On the other hand, otolaryngologists recommended fewer further examinations than ChatGPT (111 vs. 60, p < 0.001). For additional exams, poor agreement was found between ChatGPT and otolaryngologists. Only 28.57% of cases received a pertinent and essential treatment plan via ChatGPT, indicating that the platform's treatment recommendations were frequently lacking. For treatment ratings, the judges' interrater reliability was greatest (Kendall's tau = 0.824, p < 0.001; a minimal sketch of this agreement measure follows the abstract). For the most part, ChatGPT's response consistency was high.
    CONCLUSIONS: Although ChatGPT has the potential to correctly diagnose pediatric sialadenitis, there are a number of noteworthy limitations with regard to its ability to suggest further testing and treatment regimens. Before widespread clinical use, more research and confirmation are required. To guarantee that chatbots are utilized properly and effectively to supplement human expertise rather than to replace it, a critical viewpoint is required.
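    The interrater-reliability figure reported above (Kendall's tau = 0.824) is a rank correlation over the two judges' ordinal ratings. The sketch below shows how such a value can be computed with scipy.stats.kendalltau; the ratings are made-up placeholders, not the study's data.

    ```python
    # Minimal sketch: Kendall's tau as an interrater-reliability measure,
    # as used for the treatment ratings in the abstract above.
    # The ratings below are illustrative placeholders, not the study's data.
    from scipy.stats import kendalltau

    # Hypothetical ratings of ChatGPT's treatment plans by two judges
    # (e.g., an ordinal adequacy score per case).
    judge_a = [3, 4, 2, 5, 3, 4, 1, 2, 4, 3]
    judge_b = [3, 4, 2, 4, 3, 4, 1, 3, 4, 3]

    tau, p_value = kendalltau(judge_a, judge_b)
    print(f"Kendall's tau = {tau:.3f}, p = {p_value:.4f}")
    ```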

  • Article type: Journal Article
    OBJECTIVE: Artificial intelligence (AI) applications are increasingly being utilized by both patients and physicians for accessing medical information. This study focused on the urolithiasis section (pertaining to kidney and ureteral stones) of the European Association of Urology (EAU) guideline, a key reference for urologists.
    METHODS: We directed inquiries to four distinct AI chatbots to assess their responses in relation to guideline adherence. A total of 115 recommendations were transformed into questions, and responses were evaluated by two urologists with a minimum of 5 years of experience using a 5-point Likert scale (1 - False, 2 - Inadequate, 3 - Sufficient, 4 - Correct, and 5 - Very correct).
    RESULTS: The mean scores for Perplexity and ChatGPT 4.0 were 4.68 (SD: 0.80) and 4.80 (SD: 0.47), respectively, and both differed significantly from the scores of Bing and Bard (Bing vs. Perplexity, P < 0.001; Bard vs. Perplexity, P < 0.001; Bing vs. ChatGPT, P < 0.001; Bard vs. ChatGPT, P < 0.001; a sketch of this kind of pairwise score comparison follows the abstract). Bing had a mean score of 4.21 (SD: 0.96), while Bard scored 3.56 (SD: 1.14), with a significant difference (Bing vs. Bard, P < 0.001). Bard exhibited the lowest score among all chatbots. Analysis of references revealed that Perplexity and Bing cited the guideline most frequently (47.3% and 30%, respectively).
    CONCLUSIONS: Our findings demonstrate that ChatGPT 4.0 and, notably, Perplexity align well with EAU guideline recommendations. These continuously evolving applications may play a crucial role in delivering information to physicians in the future, especially for urolithiasis.
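    The evaluation above rests on 5-point Likert ratings summarized per chatbot and compared pairwise. The sketch below shows one way such a comparison can be run; the scores are illustrative placeholders, and the Mann-Whitney U test is used here as a stand-in, since the abstract does not state which pairwise test the authors applied.

    ```python
    # Minimal sketch: summarizing 5-point Likert ratings per chatbot and
    # comparing them pairwise. Scores are illustrative placeholders, not
    # the study's data; Mann-Whitney U is an assumed choice of test.
    from itertools import combinations
    import numpy as np
    from scipy.stats import mannwhitneyu

    scores = {
        "ChatGPT 4.0": [5, 5, 4, 5, 5, 4, 5, 5],
        "Perplexity":  [5, 4, 5, 5, 4, 5, 5, 4],
        "Bing":        [4, 5, 3, 4, 5, 4, 4, 3],
        "Bard":        [3, 4, 2, 4, 3, 5, 3, 4],
    }

    # Per-chatbot mean and sample standard deviation.
    for name, vals in scores.items():
        print(f"{name}: mean={np.mean(vals):.2f}, SD={np.std(vals, ddof=1):.2f}")

    # Pairwise comparison of rating distributions.
    for a, b in combinations(scores, 2):
        stat, p = mannwhitneyu(scores[a], scores[b], alternative="two-sided")
        print(f"{a} vs. {b}: U={stat:.1f}, p={p:.3f}")
    ```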