Chatbots

  • Article type: Journal Article
    BACKGROUND: The demand for mental health (MH) services in the community continues to exceed supply. At the same time, technological developments make the use of artificial intelligence-empowered conversational agents (CAs) a real possibility to help fill this gap.
    OBJECTIVE: The objective of this review was to identify existing empathic CA design architectures within the MH care sector and to assess their technical performance in detecting and responding to user emotions in terms of classification accuracy. In addition, the approaches used to evaluate empathic CAs within the MH care sector in terms of their acceptability to users were considered. Finally, this review aimed to identify limitations and future directions for empathic CAs in MH care.
    METHODS: A systematic literature search was conducted across 6 academic databases to identify journal articles and conference proceedings using search terms covering 3 topics: "conversational agents," "mental health," and "empathy." Only studies discussing CA interventions for the MH care domain were eligible for this review, with both textual and vocal characteristics considered as possible data inputs. Quality was assessed using appropriate risk of bias and quality tools.
    RESULTS: A total of 19 articles met all inclusion criteria. Most (12/19, 63%) of these empathic CA designs in MH care were machine learning (ML) based, with 26% (5/19) hybrid engines and 11% (2/19) rule-based systems. Among the ML-based CAs, 47% (9/19) used neural networks, with transformer-based architectures being well represented (7/19, 37%). The remaining 16% (3/19) of the ML models were unspecified. Technical assessments of these CAs focused on response accuracies and their ability to recognize, predict, and classify user emotions. While single-engine CAs demonstrated good accuracy, the hybrid engines achieved higher accuracy and provided more nuanced responses. Of the 19 studies, human evaluations were conducted in 16 (84%), with only 5 (26%) focusing directly on the CA's empathic features. All these papers used self-reports for measuring empathy, including single or multiple (scale) ratings or qualitative feedback from in-depth interviews. Only 1 (5%) paper included evaluations by both CA users and experts, adding more value to the process.
    CONCLUSIONS: The integration of CA design and its evaluation is crucial to produce empathic CAs. Future studies should focus on using a clear definition of empathy and standardized scales for empathy measurement, ideally including expert assessment. In addition, the diversity in measures used for technical assessment and evaluation poses a challenge for comparing CA performances, which future research should also address. However, CAs with good technical and empathic performance are already available to users of MH care services, showing promise for new applications, such as helpline services.
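    The results above note that most empathic CA designs were machine learning based, with transformer architectures used to recognize and classify user emotions. As a purely illustrative sketch of that emotion-detection stage (not any of the reviewed systems), the snippet below classifies a user utterance with a pretrained transformer; the Hugging Face transformers pipeline API and the public checkpoint named in the code are assumptions made for this example.

        # Illustrative only: emotion detection for a single user utterance, as a
        # stand-in for the classification stage of an empathic conversational agent.
        # Assumes the Hugging Face `transformers` package and the checkpoint below.
        from transformers import pipeline

        emotion_clf = pipeline(
            "text-classification",
            model="j-hartmann/emotion-english-distilroberta-base",  # assumed public checkpoint
        )

        utterance = "I haven't been sleeping and I feel like nobody understands me."
        pred = emotion_clf(utterance)[0]  # top class, e.g. {"label": "sadness", "score": 0.93}
        print(f"Detected emotion: {pred['label']} (confidence {pred['score']:.2f})")

    A downstream dialogue engine, whether rule based, ML based, or hybrid as categorized in the review, could then condition its reply on the detected emotion.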

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    The article examines the impact of artificial intelligence on scientific writing, with a particular focus on its application in hospital pharmacy. It analyses artificial intelligence tools that enhance information retrieval, literature analysis, writing quality, and manuscript drafting. Chatbots like Consensus, along with platforms such as Scite and SciSpace, enable precise searches in scientific databases, providing evidence-based responses and references. SciSpace facilitates the generation of comparative tables and the formulation of queries regarding studies, while ResearchRabbit maps the scientific literature to identify trends. Tools like DeepL and ProWritingAid improve writing quality by correcting grammatical, stylistic, and plagiarism errors. A.R.I.A. enhances reference management, and Jenny AI assists in overcoming writer's block. Python libraries such as langchain enable advanced semantic searches and the creation of agents. Despite their benefits, artificial intelligence raises ethical concerns including biases, misinformation, and plagiarism. The importance of responsible use and critical review by experts is emphasised. In hospital pharmacy, artificial intelligence can enhance efficiency and precision in research and scientific communication. Pharmacists can use these tools to stay updated, enhance the quality of their publications, optimise information management, and facilitate clinical decision-making. In conclusion, artificial intelligence is a powerful tool for hospital pharmacy, provided it is used responsibly and ethically.
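    The article above mentions Python libraries such as langchain for advanced semantic search over the literature. A minimal, hedged sketch of that idea follows: it embeds a few abstract snippets into an in-memory FAISS index and retrieves the most relevant one. It assumes the langchain-community, langchain-openai, and faiss-cpu packages plus an OPENAI_API_KEY environment variable; import paths vary across langchain versions, and the snippets are invented for illustration.

        # Minimal semantic-search sketch with langchain; illustrative only.
        from langchain_community.vectorstores import FAISS
        from langchain_openai import OpenAIEmbeddings

        abstracts = [
            "Chatbots improve patient engagement in medical history-taking.",
            "Fibrates are discussed as second-line options in PBC management.",
            "SMS interventions support physical activity and weight loss.",
        ]

        # Embed the snippets and build an in-memory vector index.
        index = FAISS.from_texts(abstracts, OpenAIEmbeddings())

        # Retrieve the snippet most relevant to a free-text question.
        hits = index.similarity_search("Which texts cover conversational agents?", k=1)
        print(hits[0].page_content)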

  • Article type: Journal Article
    BACKGROUND: The integration of artificial intelligence and chatbot technology in health care has attracted significant attention due to its potential to improve patient care and streamline history-taking. As artificial intelligence-driven conversational agents, chatbots offer the opportunity to revolutionize history-taking, necessitating a comprehensive examination of their impact on medical practice.
    OBJECTIVE: This systematic review aims to assess the role, effectiveness, usability, and patient acceptance of chatbots in medical history-taking. It also examines potential challenges and future opportunities for integration into clinical practice.
    METHODS: A systematic search included PubMed, Embase, MEDLINE (via Ovid), CENTRAL, Scopus, and Open Science and covered studies through July 2024. The inclusion and exclusion criteria for the studies reviewed were based on the PICOS (participants, interventions, comparators, outcomes, and study design) framework. The population included individuals using health care chatbots for medical history-taking. Interventions focused on chatbots designed to facilitate medical history-taking. The outcomes of interest were the feasibility, acceptance, and usability of chatbot-based medical history-taking. Studies not reporting on these outcomes were excluded. All study designs except conference papers were eligible for inclusion. Only English-language studies were considered. There were no specific restrictions on study duration. Key search terms included "chatbot*," "conversational agent*," "virtual assistant," "artificial intelligence chatbot," "medical history," and "history-taking." The quality of observational studies was classified using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) criteria (eg, sample size, design, data collection, and follow-up). The RoB 2 (Risk of Bias) tool assessed the domains and levels of bias in randomized controlled trials (RCTs).
    RESULTS: The review included 15 observational studies and 3 RCTs and synthesized evidence from different medical fields and populations. Chatbots systematically collect information through targeted queries and data retrieval, improving patient engagement and satisfaction. The results show that chatbots have great potential for history-taking and that the efficiency and accessibility of the health care system can be improved by 24/7 automated data collection. Bias assessments revealed that of the 15 observational studies, 5 (33%) studies were of high quality, 5 (33%) studies were of moderate quality, and 5 (33%) studies were of low quality. Of the RCTs, 2 had a low risk of bias, while 1 had a high risk.
    CONCLUSIONS: This systematic review provides critical insights into the potential benefits and challenges of using chatbots for medical history-taking. The included studies showed that chatbots can increase patient engagement, streamline data collection, and improve health care decision-making. For effective integration into clinical practice, it is crucial to design user-friendly interfaces, ensure robust data security, and maintain empathetic patient-physician interactions. Future research should focus on refining chatbot algorithms, improving their emotional intelligence, and extending their application to different health care settings to realize their full potential in modern medicine.
    TRIAL REGISTRATION: PROSPERO CRD42023410312; www.crd.york.ac.uk/prospero.

  • Article type: Journal Article
    The emergence of digitalization and artificial intelligence has had a profound impact on society, especially in the field of medicine. Digital health is now a reality, with an increasing number of people using chatbots for prognostic or diagnostic purposes, therapeutic planning, and monitoring, as well as for nutritional and mental health support. Initially designed for various purposes, chatbots have demonstrated significant advantages in the medical field, as indicated by multiple sources. However, there are conflicting views in the current literature, with some sources highlighting their drawbacks and limitations, particularly in their use in oncology. This state-of-the-art review article seeks to present both the benefits and the drawbacks of chatbots in the context of medicine and cancer, while also addressing the challenges in their implementation, offering expert insights on the subject.

  • Article type: Journal Article
    This paper examines the use of text message (SMS) interventions for health-related behavioral support. It first outlines the historical progress in SMS intervention research publications and the variety of funding from US government agencies. A narrative review follows, highlighting the effectiveness of SMS interventions in key health areas, such as physical activity, diet and weight loss, mental health, and substance use, based on published meta-analyses. It then outlines advantages of text messaging compared to other digital modalities, including the real-time capability to collect information and deliver microdoses of intervention support. Crucial design elements are proposed to optimize effectiveness and longitudinal engagement across communication strategies, psychological foundations, and behavior change tactics. We then discuss advanced functionalities, such as the potential for generative artificial intelligence to improve user interaction. Finally, major challenges to implementation are highlighted, including the absence of a dedicated commercial platform, privacy and security concerns with SMS technology, difficulties integrating SMS interventions with medical informatics systems, and concerns about user engagement. Proposed solutions aim to facilitate the broader application and effectiveness of SMS interventions. Our hope is that these insights can assist researchers and practitioners in using SMS interventions to improve health outcomes and reduce disparities.
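    As one concrete illustration of the "microdose" delivery the paper describes, the hedged sketch below sends a single check-in text through the Twilio Python SDK; Twilio is an assumed gateway rather than anything specified in the paper, and the credentials and phone numbers are placeholders.

        # Illustrative only: delivering one SMS "microdose" of intervention support.
        # Assumes the twilio package plus placeholder credentials and numbers.
        import os
        from twilio.rest import Client

        client = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])

        message = client.messages.create(
            to="+15551230000",     # participant's number (placeholder)
            from_="+15559870000",  # study's sending number (placeholder)
            body="Quick check-in: how many minutes were you active today? Reply with a number.",
        )
        print(message.sid)  # provider-side message ID, useful for delivery logging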

  • Article type: Journal Article
    Informal caregivers (ICs), including the patient's spouse, close relatives, or friends, play an important role in caring for individuals with head and neck cancer (HNC). AI-based chatbots might offer information and assistance related to caregiving. This study presents the viewpoints of ICs and healthcare professionals (HCPs) on using AI-based chatbots in caring for individuals with HNC. A total of six focus groups were conducted with 15 ICs and 13 HCPs from three Swedish university hospitals. The study uncovers widespread hesitancy toward using AI-based chatbots among ICs and HCPs. Factors contributing to this reluctance include distrust in chatbot-provided information, negative past experiences with chatbots, and lack of human connection in chatbot interactions. Embracing a holistic approach is crucial when designing chatbots, ensuring active user engagement and incorporating users' perspectives into the design process.

  • Article type: Journal Article
    BACKGROUND: The modern era of cognitive intelligence in the clinical space has led to the rise of 'Medical Cognitive Virtual Agents' (MCVAs), intelligent virtual assistants that interact with users in a context-sensitive and ambient manner. They aim to augment users' cognitive capabilities, thereby helping both patients and medical experts to provide personalized healthcare such as remote health tracking, emergency healthcare, and robotic diagnosis of critical illness, among others. The objective of this study is to explore the technical aspects of MCVAs and their relevance in modern healthcare.
    METHODS: In this study, a comprehensive and interpretable analysis of MCVAs is presented and their impacts are discussed. A novel system framework prototype based on artificial intelligence for MCVAs is presented, and architectural workflows for potential applications of MCVA functionalities are detailed. A novel MCVA relevance survey was undertaken during March-April 2023 in Bhubaneswar, Odisha, India, to understand the current position of MCVAs in society.
    RESULTS: The survey delivered constructive results. The majority of people associated with healthcare showed an inclination toward MCVAs. Curiosity about MCVAs was greater in urban zones than in rural areas, and elderly citizens were more inclined to use MCVAs than younger respondents. Medical decision support emerged as the most preferred application of MCVAs.
    CONCLUSIONS: The article established and validated the relevance of MCVAs in modern healthcare. The study showed that MCVAs are likely to grow in the future and can prove to be effective assistants to medical experts in the coming days.

  • Article type: Journal Article
    BACKGROUND: Although history taking is fundamental for diagnosing medical conditions, teaching and providing feedback on the skill can be challenging due to resource constraints. Virtual simulated patients and web-based chatbots have thus emerged as educational tools, with recent advancements in artificial intelligence (AI) such as large language models (LLMs) enhancing their realism and potential to provide feedback.
    OBJECTIVE: In our study, we aimed to evaluate the effectiveness of a Generative Pretrained Transformer (GPT) 4 model to provide structured feedback on medical students' performance in history taking with a simulated patient.
    METHODS: We conducted a prospective study involving medical students performing history taking with a GPT-powered chatbot. To that end, we designed a chatbot to simulate patients' responses and provide immediate feedback on the comprehensiveness of the students' history taking. Students' interactions with the chatbot were analyzed, and feedback from the chatbot was compared with feedback from a human rater. We measured interrater reliability and performed a descriptive analysis to assess the quality of feedback.
    RESULTS: Most of the study's participants were in their third year of medical school. A total of 1894 question-answer pairs from 106 conversations were included in our analysis. GPT-4's role-play and responses were medically plausible in more than 99% of cases. Interrater reliability between GPT-4 and the human rater showed "almost perfect" agreement (Cohen κ=0.832). Lower agreement (κ<0.6), detected in 8 of the 45 feedback categories, highlighted topics for which the model's assessments were overly specific or diverged from human judgment.
    CONCLUSIONS: The GPT model was effective in providing structured feedback on history-taking dialogs provided by medical students. Although we identified some limitations regarding the specificity of feedback for certain categories, the overall high agreement with human raters suggests that LLMs can be a valuable tool for medical education. Our findings therefore advocate the careful integration of AI-driven feedback mechanisms in medical training and highlight important aspects to consider when LLMs are used in that context.
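    The study above reports agreement between GPT-4 and a human rater as Cohen κ. As a minimal illustration of that computation (with invented binary ratings, not the study's data), the snippet below uses scikit-learn's cohen_kappa_score; by the Landis and Koch convention, κ above 0.80 is commonly labeled "almost perfect" agreement.

        # Cohen's kappa for two raters' binary judgments (1 = feedback item
        # addressed, 0 = not addressed); ratings are invented for illustration.
        from sklearn.metrics import cohen_kappa_score

        gpt4_ratings  = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
        human_ratings = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]

        kappa = cohen_kappa_score(gpt4_ratings, human_ratings)
        print(f"Cohen's kappa = {kappa:.3f}")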

  • Article type: Journal Article
    OBJECTIVE: Autoimmune liver diseases (AILDs) are rare and require precise evaluation, which is often challenging for medical providers. Chatbots are innovative solutions to assist healthcare professionals in clinical management. In our study, ten liver specialists systematically evaluated four chatbots to determine their utility as clinical decision support tools in the field of AILDs.
    METHODS: We constructed a 56-question questionnaire focusing on AILD evaluation, diagnosis, and management of autoimmune hepatitis (AIH), primary biliary cholangitis (PBC), and primary sclerosing cholangitis (PSC). Four chatbots (ChatGPT 3.5, Claude, Microsoft Copilot, and Google Bard) were presented with the questions in their free tiers in December 2023. Responses underwent critical evaluation by ten liver specialists using a standardized 1 to 10 Likert scale. The analysis included mean scores, the number of highest-rated replies, and the identification of common shortcomings in chatbot performance.
    RESULTS: Among the assessed chatbots, specialists rated Claude highest with a mean score of 7.37 (SD = 1.91), followed by ChatGPT (7.17, SD = 1.89), Microsoft Copilot (6.63, SD = 2.10), and Google Bard (6.52, SD = 2.27). Claude also excelled with 27 best-rated replies, outperforming ChatGPT (20), while Microsoft Copilot and Google Bard lagged with only 6 and 9, respectively. Common deficiencies included listing details over specific advice, limited dosing options, inaccuracies for pregnant patients, insufficient recent data, over-reliance on CT and MRI imaging, and inadequate discussion regarding off-label use and fibrates in PBC treatment. Notably, internet access for Microsoft Copilot and Google Bard did not enhance precision compared to pre-trained models.
    CONCLUSIONS: Chatbots hold promise in AILD support, but our study underscores key areas for improvement. Refinement is needed in providing specific advice, improving accuracy, and delivering focused, up-to-date information. Addressing these shortcomings is essential for enhancing the utility of chatbots in AILD management, guiding future development, and ensuring their effectiveness as clinical decision-support tools.
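    The evaluation above summarizes ten specialists' 1-to-10 Likert ratings per chatbot as a mean and standard deviation. A trivial sketch of that aggregation is shown below; the ratings are invented for illustration and are not the study's data.

        # Aggregating hypothetical 1-10 Likert ratings from ten specialists per chatbot.
        from statistics import mean, stdev

        ratings = {
            "Claude":            [8, 7, 9, 6, 8, 7, 7, 8, 6, 8],
            "ChatGPT 3.5":       [7, 7, 8, 6, 7, 8, 6, 7, 8, 7],
            "Microsoft Copilot": [6, 7, 5, 7, 6, 8, 6, 7, 7, 6],
            "Google Bard":       [6, 5, 7, 6, 7, 6, 8, 6, 6, 7],
        }

        for name, scores in ratings.items():
            print(f"{name}: mean={mean(scores):.2f}, SD={stdev(scores):.2f}")  # sample SD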
