Chatbots

  • Article type: Journal Article
    BACKGROUND: Patients find technology tools to be more approachable for seeking sensitive health-related information, such as reproductive health information. The inventive conversational ability of artificial intelligence (AI) chatbots, such as ChatGPT (OpenAI Inc), offers a potential means for patients to effectively locate answers to their health-related questions digitally.
    OBJECTIVE: A pilot study was conducted to compare the novel ChatGPT with the existing Google Search technology for their ability to offer accurate, effective, and current information on the recommended course of action after a missed dose of an oral contraceptive pill.
    METHODS: A sequence of 11 questions, mimicking a patient inquiring about the action to take after missing a dose of an oral contraceptive pill, was input into ChatGPT as a cascade, given the conversational ability of ChatGPT. The questions were input into 4 different ChatGPT accounts, with the account holders being of various demographics, to evaluate potential differences and biases in the responses given to different account holders. The leading question, "what should I do if I missed a day of my oral contraception birth control?" alone was then input into Google Search, given its nonconversational nature. The results from the ChatGPT questions and the Google Search results for the leading question were evaluated on their readability, accuracy, and effective delivery of information.
    RESULTS: The ChatGPT results were at an overall higher-grade reading level, had a longer reading duration, and were less accurate, less current, and less effective in delivering information. In contrast, the Google Search answer box and snippets were at a lower-grade reading level, had a shorter reading duration, were more current, referenced the origin of the information (transparent), and provided the information in various formats in addition to text.
    CONCLUSIONS: ChatGPT has room for improvement in accuracy, transparency, recency, and reliability before it can be equitably implemented into health care information delivery and provide the potential benefits it offers. However, AI may be used as a tool for providers to educate their patients in preferred, creative, and efficient ways, such as using AI to generate accessible short educational videos from health care provider-vetted information. Larger studies representing a diverse group of users are needed.
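    The readability and reading-duration comparison above can be reproduced with standard tooling. Below is a minimal sketch in Python, assuming the Flesch-Kincaid Grade Level formula and a reading speed of roughly 200 words per minute (common defaults; the abstract does not state which readability metric or reading speed the authors used), applied to a hypothetical chatbot answer:

        import re

        def flesch_kincaid_grade(text):
            # Flesch-Kincaid Grade Level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59.
            # Syllables are approximated by counting vowel groups, so the result is an estimate.
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z']+", text)
            n_words = max(1, len(words))
            n_syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
            return 0.39 * n_words / sentences + 11.8 * n_syllables / n_words - 15.59

        def reading_minutes(text, words_per_minute=200):
            # Estimated reading duration at an assumed average reading speed.
            return len(text.split()) / words_per_minute

        # Illustrative answer text only; not an actual ChatGPT or Google Search result.
        answer = ("If you missed one active pill, take it as soon as you remember "
                  "and take the next pill at the usual time, even if that means "
                  "taking two pills in one day.")
        print(f"Grade level: {flesch_kincaid_grade(answer):.1f}")
        print(f"Reading time: {reading_minutes(answer):.2f} min")

    The same two scores can then be computed for the ChatGPT output and for the Google Search answer-box text and compared.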

  • Article type: Journal Article
    BACKGROUND: Large language models (LLMs), such as ChatGPT (OpenAI), are increasingly used in medicine and supplement standard search engines as information sources. This leads to more "consultations" of LLMs about personal medical symptoms.
    OBJECTIVE: This study aims to evaluate ChatGPT's performance in answering clinical case-based questions in otorhinolaryngology (ORL) in comparison to ORL consultants' answers.
    METHODS: We used 41 case-based questions from established ORL study books and past German state examinations for doctors. The questions were answered by both ORL consultants and ChatGPT 3. ORL consultants rated all responses, except their own, on medical adequacy, conciseness, coherence, and comprehensibility using a 6-point Likert scale. They also identified (in a blinded setting) if the answer was created by an ORL consultant or ChatGPT. Additionally, the character count was compared. Due to the rapidly evolving pace of technology, a comparison between responses generated by ChatGPT 3 and ChatGPT 4 was included to give an insight into the evolving potential of LLMs.
    RESULTS: Ratings in all categories were significantly higher for ORL consultants (P<.001). Although inferior to the scores of the ORL consultants, ChatGPT's scores were relatively higher in semantic categories (conciseness, coherence, and comprehensibility) compared to medical adequacy. ORL consultants identified ChatGPT as the source correctly in 98.4% (121/123) of cases. ChatGPT's answers had a significantly higher character count compared to ORL consultants (P<.001). Comparison between responses generated by ChatGPT 3 and ChatGPT 4 showed a slight improvement in medical accuracy as well as better coherence of the answers provided. Contrarily, neither the conciseness (P=.06) nor the comprehensibility (P=.08) improved significantly despite a significant 52.5% increase in the mean character count ((1470-964)/964; P<.001).
    CONCLUSIONS: While ChatGPT provided longer answers to medical problems, medical adequacy and conciseness were significantly lower compared to ORL consultants' answers. LLMs have potential as augmentative tools for medical care, but their "consultation" for medical problems carries a high risk of misinformation, as their high semantic quality may mask contextual deficits.
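    The 52.5% figure in the results follows directly from the mean character counts reported in the abstract (964 for ChatGPT 3, 1470 for ChatGPT 4); a one-line check:

        gpt3_chars, gpt4_chars = 964, 1470          # mean character counts from the abstract
        increase = (gpt4_chars - gpt3_chars) / gpt3_chars
        print(f"{increase:.1%}")                    # -> 52.5%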

  • Article type: Journal Article
    BACKGROUND: The prevalence of child and adolescent mental health issues is increasing faster than the number of services available, leading to a shortfall. Mental health chatbots are a highly scalable method to address this gap. Manage Your Life Online (MYLO) is an artificially intelligent chatbot that emulates the method of levels therapy. Method of levels is a therapy that uses curious questioning to support the sustained awareness and exploration of current problems.
    OBJECTIVE: This study aimed to assess the feasibility and acceptability of a co-designed interface for MYLO in young people aged 16 to 24 years with mental health problems.
    METHODS: An iterative co-design phase occurred over 4 months, in which feedback was elicited from a group of young people (n=7) with lived experiences of mental health issues. This resulted in the development of a progressive web application version of MYLO that could be used on mobile phones. We conducted a case series to assess the feasibility and acceptability of MYLO in 13 young people over 2 weeks. During this time, the participants tested MYLO and completed surveys including clinical outcomes and acceptability measures. We then conducted focus groups and interviews and used thematic analysis to obtain feedback on MYLO and identify recommendations for further improvements.
    RESULTS: Most participants were positive about their experience of using MYLO and would recommend MYLO to others. The participants enjoyed the simplicity of the interface, found it easy to use, and rated it as acceptable using the System Usability Scale. Inspection of the use data found evidence that MYLO can learn and adapt its questioning in response to user input. We found a large effect size for the decrease in participants' problem-related distress and a medium effect size for the increase in their self-reported tendency to resolve goal conflicts (the proposed mechanism of change) in the testing phase. Some patients also experienced a reliable change in their clinical outcome measures over the 2 weeks.
    CONCLUSIONS: We established the feasibility and acceptability of MYLO. The initial outcomes suggest that MYLO has the potential to support the mental health of young people and help them resolve their own problems. We aim to establish whether the use of MYLO leads to a meaningful reduction in participants\' symptoms of depression and anxiety and whether these are maintained over time by conducting a randomized controlled evaluation trial.
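    The abstract reports a large effect for the drop in problem-related distress without naming the statistic used; one common choice for pre/post case-series data is Cohen's d computed on the change scores. A minimal sketch under that assumption, with made-up placeholder scores rather than study data:

        from statistics import mean, stdev

        def cohens_d_paired(pre, post):
            # Cohen's d for paired data: mean change divided by the SD of the change scores.
            diffs = [b - a for a, b in zip(pre, post)]
            return mean(diffs) / stdev(diffs)

        # Hypothetical problem-related distress ratings (higher = more distress); not study data.
        pre  = [7.0, 6.5, 8.0, 5.5, 7.5, 6.0]
        post = [4.0, 5.0, 5.5, 4.5, 5.0, 4.0]
        d = cohens_d_paired(pre, post)
        print(f"d = {d:.2f}")  # |d| >= 0.8 is conventionally described as a large effect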