Keywords: Artificial intelligence; Bard; Bing; ChatGPT; Chatbot; Copilot; Erectile dysfunction; Ernie bot

MeSH: Male; Humans; Artificial Intelligence; Erectile Dysfunction; Software; Benchmarking; Linguistics

Source: DOI:10.1007/s10916-024-02056-0

Abstract:
This study evaluates and compares the quality and readability of responses generated by five artificial intelligence (AI) chatbots (ChatGPT, Bard, Bing, Ernie, and Copilot) to the most frequently searched queries about erectile dysfunction (ED). Google Trends was used to identify relevant ED-related phrases. Each AI chatbot received a specific sequence of 25 frequently searched terms as input. Responses were evaluated using DISCERN, Ensuring Quality Information for Patients (EQIP), Flesch-Kincaid Grade Level (FKGL), and Flesch-Kincaid Reading Ease (FKRE) metrics. The three most frequently searched phrases were "erectile dysfunction cause," "how to erectile dysfunction," and "erectile dysfunction treatment." Zimbabwe, Zambia, and Ghana exhibited the highest level of interest in ED. None of the AI chatbots achieved the necessary degree of readability. However, Bard exhibited significantly higher FKRE and FKGL ratings (p = 0.001), and Copilot achieved better EQIP and DISCERN ratings than the other chatbots (p = 0.001). Bard exhibited the simplest linguistic framework and posed the least challenge in terms of readability and comprehension, while Copilot's text quality on ED was superior to that of the other chatbots. As new chatbots are introduced, their understandability and text quality increase, providing better guidance to patients.
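For readers unfamiliar with the two readability metrics named above, the following is a minimal sketch of how FKRE and FKGL are commonly computed from sentence, word, and syllable counts. It is not the scoring tool used in the study, which the abstract does not name. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here for illustration only; dedicated tools use more careful rules.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a standard): count vowel groups
    and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (FKRE, FKGL) for English text using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Flesch Reading Ease: higher scores mean easier text.
    fkre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid Grade Level: approximate US school grade needed to read the text.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fkre, fkgl


if __name__ == "__main__":
    fkre, fkgl = readability("The cat sat on the mat.")
    print(f"FKRE={fkre:.2f}, FKGL={fkgl:.2f}")
```

Both scores depend only on average sentence length and average syllables per word, so shorter sentences and shorter words raise FKRE and lower FKGL.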