Keywords: Artificial intelligence; ChatGPT; Essential tremor; Large language model; Movement disorders

MeSH: Essential Tremor / diagnosis; Humans; Comprehension; Female; Male; Middle Aged; Aged; Adult; Health Literacy

Source: DOI:10.5334/tohm.917

Abstract:
Large language models (LLMs) driven by artificial intelligence allow people to engage in direct conversations about their health. The accuracy and readability of the answers provided by ChatGPT, the most widely known LLM, about essential tremor (ET), one of the most common movement disorders, have not yet been evaluated.
Answers given by ChatGPT to 10 questions about ET were evaluated by 5 professionals and 15 laypeople, who scored each response from 1 (poor) to 5 (excellent) for clarity, relevance, accuracy (professionals only), comprehensiveness, and overall value. We further calculated the readability of the answers.
ChatGPT answers received relatively positive evaluations, with median scores between 4 and 5 in both groups, independent of the type of question. However, there was only moderate agreement between raters, especially among the professionals. Moreover, readability levels were poor for all examined answers.
ChatGPT provided relatively accurate and relevant answers, with some variability in the professionals' ratings. This suggests that the degree of literacy about ET influenced the ratings and, indirectly, that the quality of information provided in clinical practice is also variable. Moreover, the readability of the answers provided by ChatGPT was found to be poor. LLMs will likely play a significant role in the future; therefore, health-related content generated by these tools should be monitored.
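The abstract reports that readability was calculated but does not name the index used. As an illustration only, the sketch below assumes the Flesch Reading Ease score, a common choice for health texts, and uses a rough vowel-group heuristic for syllable counting. The function names `count_syllables` and `flesch_reading_ease` are hypothetical and are not taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count groups of consecutive vowels, with a minimum of one."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "disease"), except in
    # "-le" endings such as "table", where it forms a syllable.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease:
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Hypothetical sample answer, not taken from the study data.
    answer = (
        "Essential tremor is a neurological condition characterized by "
        "involuntary rhythmic oscillation."
    )
    print(round(flesch_reading_ease(answer), 1))
```

Higher scores indicate easier text. Long sentences and polysyllabic medical terms both lower the score, which is one plausible reason generated health answers can rate as hard to read.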