Chatbot

  • Article type: Editorial
    No abstract available.

  • Article type: Journal Article
    OBJECTIVE: To develop content-aware chatbots based on GPT-3.5-Turbo and GPT-4 with specialized knowledge of the German S2 cone-beam CT (CBCT) dental imaging guideline, and to compare their performance against humans.
    METHODS: The LlamaIndex software library was used to integrate the guideline context into the chatbots. Based on the CBCT S2 guideline, 40 questions were posed to the content-aware chatbots; early career and senior practitioners with different levels of experience served as reference. The chatbots' performance was compared in terms of recommendation accuracy and explanation quality. A chi-square test and a one-tailed Wilcoxon signed-rank test evaluated accuracy and explanation quality, respectively.
    RESULTS: The GPT-4-based chatbot provided 100% correct recommendations and superior explanation quality compared to the chatbot based on GPT-3.5-Turbo (87.5% vs. 57.5%; P = .003). Moreover, it outperformed early career practitioners in correct answers (P = .002 and P = .032) and earned higher trust than the chatbot using GPT-3.5-Turbo (P = .006).
    CONCLUSIONS: A content-aware chatbot using GPT-4 reliably provided recommendations according to the current consensus guideline. The responses were deemed trustworthy and transparent, thereby facilitating the integration of artificial intelligence into clinical decision-making.
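    The methods name LlamaIndex as the integration layer but give no implementation details. Below is a minimal sketch of how a guideline document can be indexed and served as chat context with LlamaIndex; the "guideline" folder path, model settings, and sample question are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of a guideline-aware chatbot built with LlamaIndex.
# The "guideline/" folder and the sample question are assumptions for
# illustration; the paper's prompts and setup are not specified.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Use GPT-4 as the underlying LLM (GPT-3.5-Turbo can be swapped in here).
Settings.llm = OpenAI(model="gpt-4")

# Load the guideline document(s) and build a vector index over their text.
documents = SimpleDirectoryReader("guideline").load_data()
index = VectorStoreIndex.from_documents(documents)

# Answer questions with retrieved guideline passages injected as context.
chat_engine = index.as_chat_engine(chat_mode="context")
print(chat_engine.chat("Is CBCT indicated for an impacted lower third molar?"))
```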

  • Article type: Journal Article
    CONCLUSIONS: We created a LangChain/OpenAI API-powered chatbot based solely on the International Consensus Statement on Allergy and Rhinology: Rhinosinusitis (ICAR-RS). The ICAR-RS chatbot is able to provide direct and actionable recommendations. Utilization of consensus statements in this way provides an opportunity for AI applications in healthcare.
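    The abstract names LangChain and the OpenAI API but shows no implementation. A minimal sketch of a chatbot grounded solely in one consensus document might look as follows; the file name, chunking parameters, model choice, and sample question are assumptions for illustration.

```python
# Sketch of a retrieval chatbot restricted to a single consensus document,
# using LangChain with the OpenAI API. "icar_rs.pdf" and all parameters
# are hypothetical; requires the faiss-cpu package for the vector store.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA

# Load and chunk the consensus statement.
docs = PyPDFLoader("icar_rs.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Index the chunks so answers are retrieved from the statement alone.
store = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Answer questions using only retrieved ICAR-RS passages as context.
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(model="gpt-4"),
                                 retriever=store.as_retriever())
answer = qa.invoke({"query": "What is first-line therapy for chronic rhinosinusitis?"})
print(answer["result"])
```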

  • Article type: Journal Article
    BACKGROUND: Innovative large language model (LLM)-powered chatbots, now extremely popular, represent a potential source of resuscitation information for the general public. For instance, chatbot-generated advice could be used for community resuscitation education or for just-in-time informational support of untrained lay rescuers in a real-life emergency.
    OBJECTIVE: This study assessed the performance of two prominent LLM-based chatbots, particularly the quality of the chatbot-generated advice on how to help a non-breathing victim.
    METHODS: In May 2023, the new Bing (Microsoft Corporation, USA) and Bard (Google LLC, USA) chatbots were each queried 20 times: "What to do if someone is not breathing?" The content of the chatbots' responses was evaluated for compliance with the 2021 Resuscitation Council United Kingdom guidelines using a pre-developed checklist.
    RESULTS: Both chatbots provided context-dependent textual responses to the query. However, coverage of the guideline-consistent instructions on helping a non-breathing victim was poor: the mean percentage of responses completely satisfying the checklist criteria was 9.5% for Bing and 11.4% for Bard (P > .05). Essential elements of bystander action, including early start and uninterrupted performance of chest compressions with adequate depth, rate, and chest recoil, as well as the request for and use of an automated external defibrillator (AED), were, as a rule, missing. Moreover, 55.0% of Bard's responses contained plausible-sounding but nonsensical guidance, known as artificial hallucinations, which creates a risk of inadequate care and harm to the victim.
    CONCLUSIONS: The LLM-powered chatbots' advice on helping a non-breathing victim omits essential details of resuscitation technique and occasionally contains deceptive, potentially harmful directives. Further research and regulatory measures are required to mitigate the risks of chatbot-generated misinformation on resuscitation.
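    The evaluation amounts to scoring each response against a guideline checklist and averaging compliance per chatbot. A toy sketch of such scoring follows; the checklist items and response data are invented, and the study's actual 2021 RCUK-based instrument is not reproduced here.

```python
# Sketch of checklist-based compliance scoring for chatbot responses.
# Checklist items and example data are illustrative assumptions, not the
# study's actual instrument or results.
from statistics import mean

CHECKLIST = [
    "check responsiveness", "call emergency services",
    "start chest compressions", "adequate depth and rate",
    "full chest recoil", "minimize interruptions", "request and use an AED",
]

def compliance(response_items: set[str]) -> float:
    """Percentage of checklist criteria a single response satisfies."""
    return 100 * len(response_items & set(CHECKLIST)) / len(CHECKLIST)

# Hypothetical scored responses (each set = items found in one response).
bing = [compliance(s) for s in ({"call emergency services"},
                                {"call emergency services", "start chest compressions"})]
bard = [compliance(s) for s in ({"request and use an AED"},
                                {"call emergency services", "request and use an AED"})]
print(f"Mean compliance: Bing {mean(bing):.1f}%, Bard {mean(bard):.1f}%")
```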

  • Article type: Journal Article
    BACKGROUND: ChatGPT-4 is the latest release of a novel artificial intelligence (AI) chatbot able to answer freely formulated and complex questions. In the near future, ChatGPT could become the new standard for health care professionals and patients to access medical information. However, little is known about the quality of the medical information the AI provides.
    OBJECTIVE: We aimed to assess the reliability of medical information provided by ChatGPT.
    METHODS: Medical information provided by ChatGPT-4 on the 5 hepato-pancreatico-biliary (HPB) conditions with the highest global disease burden was measured with the Ensuring Quality Information for Patients (EQIP) tool. The EQIP tool measures the quality of internet-available information and consists of 36 items divided into 3 subsections. In addition, 5 guideline recommendations per analyzed condition were rephrased as questions and input to ChatGPT, and agreement between the guidelines and the AI's answers was measured by 2 authors independently. All queries were repeated 3 times to measure the internal consistency of ChatGPT.
    RESULTS: Five conditions were identified (gallstone disease, pancreatitis, liver cirrhosis, pancreatic cancer, and hepatocellular carcinoma). The median EQIP score across all conditions was 16 (IQR 14.5-18) of the total 36 items. By subsection, median scores for content, identification, and structure were 10 (IQR 9.5-12.5), 1 (IQR 1-1), and 4 (IQR 4-5), respectively. Agreement between guideline recommendations and the answers provided by ChatGPT was 60% (15/25). Interrater agreement as measured by the Fleiss κ was 0.78 (P < .001), indicating substantial agreement. The internal consistency of ChatGPT's answers was 100%.
    CONCLUSIONS: ChatGPT provides medical information of quality comparable to available static internet information. Although currently of limited quality, large language models could become the future standard for patients and health care professionals to gather medical information.
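    Two of the reported analyses, EQIP subsection scoring and interrater agreement via Fleiss κ, can be sketched as follows; the subsection scores and rating matrix below are invented toy data, not the study's results.

```python
# Sketch of EQIP subsection scoring and Fleiss' kappa for interrater
# agreement. All numbers are toy data for illustration only.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# EQIP: items satisfied per subsection (hypothetical scores out of 36 total).
eqip = {"content": 10, "identification": 1, "structure": 4}
print("EQIP total:", sum(eqip.values()), "of 36 items")

# Toy ratings: rows = 25 guideline questions, columns = 2 raters,
# values = 1 (answer agrees with the guideline) or 0 (disagrees).
ratings = np.random.default_rng(0).integers(0, 2, size=(25, 2))
table, _ = aggregate_raters(ratings)
print("Fleiss kappa:", fleiss_kappa(table))
```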

  • Article type: Journal Article
    BACKGROUND: Physicians spend a lot of time on routine tasks, that is, repetitive and time-consuming tasks that are essential to the diagnostic and treatment process. One of these tasks is collecting information on the patient's medical history.
    OBJECTIVE: We aim to develop a prototype of an intelligent interviewer that collects a patient's medical history before the patient-doctor encounter. From this and our previous experience in developing similar systems, we derive recommendations for developing intelligent interviewers for concrete medical domains and tasks.
    METHODS: The intelligent interviewer was implemented as a chatbot using IBM Watson Assistant in close cooperation with a family doctor.
    RESULTS: AnCha is a rule-based chatbot realized as a decision tree with 75 nodes. It asks a maximum of 44 questions on medical history and current complaints, and collects additional information on the patient, social details, and prevention.
    CONCLUSIONS: When developing an intelligent digital interviewer, it is essential to define its concrete purpose, specify the information to be collected, design the user interface, consider data security, and conduct a practice-oriented evaluation.
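    The actual system was built as a dialog in IBM Watson Assistant, whose definition is not shown. A minimal Python sketch of the same decision-tree interviewer structure, with invented questions and branching (the real tree has 75 nodes and asks up to 44 questions):

```python
# Sketch of a rule-based interviewer realized as a decision tree, analogous
# in structure to AnCha. Questions and branches are invented examples;
# the real system was implemented in IBM Watson Assistant.
from dataclasses import dataclass, field

@dataclass
class Node:
    question: str
    # Maps a patient's answer to the follow-up node; a leaf has no children.
    children: dict[str, "Node"] = field(default_factory=dict)

# Tiny illustrative tree.
tree = Node("Do you currently have any complaints?", {
    "yes": Node("Where is the pain located?"),
    "no": Node("Are you taking any regular medication?"),
})

def interview(node: Node) -> None:
    """Walk the tree, asking one question per node until a leaf is reached."""
    while node:
        answer = input(node.question + " ").strip().lower()
        node = node.children.get(answer)

interview(tree)
```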