Generative AI

  • Article type: Journal Article
    Despite recent advances, the adoption of computer vision methods into clinical and commercial applications has been hampered by the limited availability of accurate ground truth tissue annotations required to train robust supervised models. Generating such ground truth can be accelerated by annotating tissue molecularly using immunofluorescence staining (IF) and mapping these annotations to a post-IF H&E (terminal H&E). Mapping the annotations between the IF and the terminal H&E increases both the scale and the accuracy with which ground truth can be generated. However, discrepancies between terminal H&E and conventional H&E caused by IF tissue processing have limited this implementation. We sought to overcome this challenge and achieve compatibility between these parallel modalities using synthetic image generation, in which a cycle-consistent generative adversarial network (CycleGAN) was applied to transfer the appearance of conventional H&E such that it emulates the terminal H&E. These synthetic emulations allowed us to train a deep learning (DL) model for the segmentation of epithelium in the terminal H&E that could be validated against the IF staining of epithelial-based cytokeratins. The combination of this segmentation model with the CycleGAN stain transfer model enabled performant epithelium segmentation in conventional H&E images. The approach demonstrates that the training of accurate segmentation models for the breadth of conventional H&E data can be executed free of human-expert annotations by leveraging molecular annotation strategies such as IF, so long as the tissue impacts of the molecular annotation protocol are captured by generative models that can be deployed prior to the segmentation process.
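
    The two-stage inference the abstract describes (stain transfer, then segmentation) can be illustrated with a minimal PyTorch sketch. This assumes TorchScript exports of a trained CycleGAN generator and a trained epithelium segmenter; the file names, input scaling, and binary threshold are hypothetical, not taken from the paper.

```python
import torch

# Hypothetical trained models: a CycleGAN generator that maps conventional
# H&E tiles into the terminal-H&E appearance domain, and a DL segmentation
# network trained on terminal H&E with IF-derived epithelium labels.
generator = torch.jit.load("g_conv2terminal.pt").eval()       # stain transfer
segmenter = torch.jit.load("epithelium_segmenter.pt").eval()  # epithelium masks

@torch.no_grad()
def segment_conventional_he(tile: torch.Tensor) -> torch.Tensor:
    """Segment epithelium in a conventional H&E tile.

    tile: float tensor of shape (1, 3, H, W), scaled to [-1, 1].
    Returns a binary mask of shape (1, 1, H, W).
    """
    emulated = generator(tile)    # conventional -> synthetic terminal H&E
    logits = segmenter(emulated)  # segment in the terminal-H&E domain
    return (logits.sigmoid() > 0.5).float()
```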

  • Article type: Journal Article
    BACKGROUND: Those receiving the diagnosis of multiple sclerosis (MS) over the next ten years will predominantly be part of Generation Z (Gen Z). Recent observations within our clinic suggest that younger people with MS utilize online generative artificial intelligence (AI) platforms for personalized medical advice prior to their first visit with a specialist in neuroimmunology. The use of such platforms is anticipated to increase given Gen Z's technology-driven nature, desire for instant communication, and cost-consciousness. Our objective was to determine if ChatGPT (Generative Pre-trained Transformer) could diagnose MS in individuals earlier than their clinical timeline, and to assess if the accuracy differed based on age, sex, and race/ethnicity.
    METHODS: People with MS between 18 and 59 years of age were studied. The clinical timeline for people diagnosed with MS was retrospectively identified and simulated using ChatGPT-3.5 (GPT-3.5). Chats were conducted using both actual and derivatives of their age, sex, and race/ethnicity to test diagnostic accuracy. A Kaplan-Meier survival curve was estimated for time to diagnosis, clustered by subject. Differences in time to diagnosis were tested for significance using a generalized Wilcoxon test. Logistic regression (subject-specific intercept) was used to capture intra-subject correlation to test the accuracy prior to and after the inclusion of MRI data.
    RESULTS: The study cohort included 100 unique people with MS. Of those, 50 were members of Gen Z (38 female; 22 White; mean age at first symptom 20.6 years (y), standard deviation (SD)=2.2y), and 50 were non-Gen Z (34 female; 27 White; mean age at first symptom 37.0y, SD=10.4y). In addition, a total of 529 digital simulations of the original cohort of 100 people were generated (333 female; 166 White; 136 Black/African American; 107 Asian; 120 Hispanic; mean age at first symptom 31.6y, SD=12.4y), allowing 629 scripted conversations to be analyzed. The estimated median time to diagnosis in clinic was significantly longer, at 0.35y (95% CI=[0.28, 0.48]), than that by ChatGPT, at 0.08y (95% CI=[0.04, 0.24]) (p<0.0001). Prior to the inclusion of MRI data, there was no difference in diagnostic accuracy across ages or by race/ethnicity; however, males were 47% less likely to receive a correct diagnosis than females (p=0.05). After MRI data were included in GPT-3.5, the odds of an accurate diagnosis were 4.0-fold greater for Gen Z participants relative to non-Gen Z participants (p=0.01), while diagnostic accuracy was 68% lower in males relative to females (p=0.009) and 75% lower for White subjects relative to non-White subjects (p=0.0004).
    CONCLUSIONS: Although generative AI platforms enable rapid information access and are not principally designed for use in healthcare, an increase in use by Gen Z is anticipated. However, the obtained responses may not be generalizable to all users and bias may exist in select groups.
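
    A minimal sketch of the survival comparison named in the methods, using the lifelines library; the CSV file and column names are hypothetical, and the subject-clustered logistic regression with subject-specific intercepts is omitted for brevity.

```python
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical long-format data: one row per conversation/encounter, with
# time to diagnosis in years, an event indicator, and the diagnosing source.
df = pd.read_csv("ms_time_to_diagnosis.csv")  # columns: years, diagnosed, source
clinic = df[df["source"] == "clinic"]
gpt = df[df["source"] == "chatgpt"]

# Kaplan-Meier estimate of time to diagnosis for the clinical timeline.
km = KaplanMeierFitter()
km.fit(clinic["years"], event_observed=clinic["diagnosed"], label="clinic")
print(km.median_survival_time_)

# Generalized (Gehan-Breslow) Wilcoxon test for a difference between curves.
result = logrank_test(
    clinic["years"], gpt["years"],
    event_observed_A=clinic["diagnosed"], event_observed_B=gpt["diagnosed"],
    weightings="wilcoxon",
)
print(result.p_value)
```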

  • Article type: Journal Article
    BACKGROUND: Evaluating the accuracy and educational utility of artificial intelligence-generated medical cases, especially those produced by large language models such as ChatGPT-4 (developed by OpenAI), is crucial yet underexplored.
    OBJECTIVE: This study aimed to assess the educational utility of ChatGPT-4-generated clinical vignettes and their applicability in educational settings.
    METHODS: Using a convergent mixed methods design, a web-based survey was conducted from January 8 to 28, 2024, to evaluate 18 medical cases generated by ChatGPT-4 in Japanese. In the survey, 6 main question items were used to evaluate the quality of the generated clinical vignettes and their educational utility, namely information quality, information accuracy, educational usefulness, clinical match, terminology accuracy (TA), and diagnosis difficulty. Feedback was solicited from physicians specializing in general internal medicine or general medicine and experienced in medical education. Chi-square and Mann-Whitney U tests were performed to identify differences among cases, and linear regression was used to examine trends associated with physicians' experience. Thematic analysis of qualitative feedback was performed to identify areas for improvement and confirm the educational utility of the cases.
    RESULTS: Of the 73 invited participants, 71 (97%) responded. The respondents, primarily male (64/71, 90%), spanned a broad range of practice years (from 1976 to 2017) and represented diverse hospital sizes throughout Japan. The majority deemed the information quality (mean 0.77, 95% CI 0.75-0.79) and information accuracy (mean 0.68, 95% CI 0.65-0.71) to be satisfactory, with these responses being based on binary data. The average scores assigned were 3.55 (95% CI 3.49-3.60) for educational usefulness, 3.70 (95% CI 3.65-3.75) for clinical match, 3.49 (95% CI 3.44-3.55) for TA, and 2.34 (95% CI 2.28-2.40) for diagnosis difficulty, based on a 5-point Likert scale. Statistical analysis showed significant variability in content quality and relevance across the cases (P<.001 after Bonferroni correction). Participants suggested improvements in generating physical findings, using natural language, and enhancing medical TA. The thematic analysis highlighted the need for clearer documentation, clinical information consistency, content relevance, and patient-centered case presentations.
    CONCLUSIONS: ChatGPT-4-generated medical cases written in Japanese possess considerable potential as resources in medical education, with recognized adequacy in quality and accuracy. Nevertheless, there is a notable need for enhancements in the precision and realism of case details. This study emphasizes ChatGPT-4's value as an adjunctive educational tool in the medical field, requiring expert oversight for optimal application.
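
    The per-case statistics in the methods (Mann-Whitney U across cases with Bonferroni correction, chi-square on the binary items, and a linear trend against years in practice) can be sketched with scipy; all counts and ratings below are invented placeholders.

```python
import numpy as np
from scipy import stats

# Hypothetical 5-point Likert ratings for two of the 18 generated cases.
case_a = np.array([4, 3, 5, 4, 4, 3, 5])
case_b = np.array([2, 3, 2, 3, 2, 3, 2])

# Mann-Whitney U test between cases, against a Bonferroni-adjusted alpha.
n_comparisons = 153  # C(18, 2) pairwise comparisons among 18 cases
u_stat, p = stats.mannwhitneyu(case_a, case_b)
print(p < 0.05 / n_comparisons)

# Chi-square test on binary (satisfactory / unsatisfactory) counts
# (rows: cases, columns: [satisfactory, unsatisfactory]).
chi2, p_chi, dof, expected = stats.chi2_contingency([[55, 16], [40, 31]])

# Linear regression of a rating against years in practice to examine
# experience-related trends.
years = np.array([10, 25, 3, 40, 18, 7, 30])
slope, intercept, r, p_trend, se = stats.linregress(years, case_a)
```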

  • Article type: Journal Article
    OBJECTIVE: To explore the integration of generative AI, specifically large language models (LLMs), in ophthalmology education and practice, addressing their applications, benefits, challenges, and future directions.
    METHODS: A literature review and analysis of current AI applications and educational programs in ophthalmology.
    METHODS: Analysis of published studies, reviews, articles, websites, and institutional reports on AI use in ophthalmology. Examination of educational programs incorporating AI, including curriculum frameworks, training methodologies, and evaluations of AI performance on medical examinations and clinical case studies.
    RESULTS: Generative AI, particularly LLMs, shows potential to improve diagnostic accuracy and patient care in ophthalmology. Applications include aiding in patient, physician, and medical students' education. However, challenges such as AI hallucinations, biases, lack of interpretability, and outdated training data limit clinical deployment. Studies revealed varying levels of accuracy of LLMs on ophthalmology board exam questions, underscoring the need for more reliable AI integration. Several educational programs nationwide provide AI and data science training relevant to clinical medicine and ophthalmology.
    CONCLUSIONS: Generative AI and LLMs offer promising advancements in ophthalmology education and practice. Addressing challenges through comprehensive curricula that include fundamental AI principles, ethical guidelines, and updated, unbiased training data is crucial. Future directions include developing clinically relevant evaluation metrics, implementing hybrid models with human oversight, leveraging image-rich data, and benchmarking AI performance against ophthalmologists. Robust policies on data privacy, security, and transparency are essential for fostering a safe and ethical environment for AI applications in ophthalmology.

  • Article type: Journal Article
    Generative Artificial Intelligence (GenAI) caught Health Professions Education (HPE) institutions off-guard, and they are currently adjusting to a changed educational environment. On the horizon, however, is Artificial General Intelligence (AGI), which promises to be an even greater leap and challenge. This Guide begins by explaining the context and nature of AGI, including its characteristics of multi-modality, generality, adaptability, autonomy, and learning ability. It then explores the implications of AGI for students (including personalised learning and electronic tutors) and HPE institutions, and considers some of the context provided by AGI in healthcare. It then raises the problems to address, including the impact on employment, social risks, student adaptability, costs, quality, and others. After considering a possible timeline, the Guide ends by indicating some first steps that HPE institutions and educators can take to prepare for AGI.

  • Article type: Journal Article
    Global rates of mental health concerns are rising, and there is increasing realization that existing models of mental health care will not adequately expand to meet the demand. With the emergence of large language models (LLMs) has come great optimism regarding their promise to create novel, large-scale solutions to support mental health. Despite their nascence, LLMs have already been applied to mental health-related tasks. In this paper, we summarize the extant literature on efforts to use LLMs to provide mental health education, assessment, and intervention and highlight key opportunities for positive impact in each area. We then highlight risks associated with LLMs' application to mental health and encourage the adoption of strategies to mitigate these risks. The urgent need for mental health support must be balanced with responsible development, testing, and deployment of mental health LLMs. It is especially critical to ensure that mental health LLMs are fine-tuned for mental health, enhance mental health equity, and adhere to ethical standards and that people, including those with lived experience with mental health concerns, are involved in all stages from development through deployment. Prioritizing these efforts will minimize potential harms to mental health and maximize the likelihood that LLMs will positively impact mental health globally.

  • Article type: Journal Article
    Demographics, social determinants of health, and family history documented in the unstructured text within electronic health records are increasingly being studied to understand how this information can be utilized alongside structured data to improve healthcare outcomes. Since the release of the GPT models, many studies have applied them to extract this information from narrative clinical notes. Unlike existing work, our research focuses on zero-shot learning, extracting this information together while providing minimal information to the GPT model. We utilize de-identified real-world clinical notes annotated for demographics, various social determinants, and family history information. Given that the GPT model might produce text that differs from the text in the original data, we explore two sets of evaluation metrics, the traditional NER evaluation metrics and semantic similarity evaluation metrics, to fully understand the performance. Our results show that the GPT-3.5 method achieved an average of 0.975 F1 on demographics extraction, 0.615 F1 on social determinants extraction, and 0.722 F1 on family history extraction. We believe these results can be further improved through model fine-tuning or few-shot learning. Through the case studies, we also identified limitations of the GPT models, which need to be addressed in future research.
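
    A minimal sketch of the strict-match, NER-style scoring mentioned above; entity normalization and the semantic-similarity metric (which would credit near-miss generations) are left out, and the example values are invented.

```python
def entity_f1(predicted: set, gold: set) -> float:
    """Entity-level F1 under strict string matching, the traditional
    NER-style evaluation; semantic-similarity scoring would relax the
    exact-match requirement for paraphrased GPT output."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    precision = tp / len(predicted)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical GPT extraction vs. gold annotations for one clinical note.
predicted = {"female", "45 years", "lives alone", "mother: breast cancer"}
gold = {"female", "45 years", "lives alone", "mother: breast cancer", "smoker"}
print(round(entity_f1(predicted, gold), 3))  # 0.889
```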

  • Article type: Journal Article
    The purpose of this study is to explore student perceptions of generative AI use and cheating in health professions education. The authors sought to understand when students believe it is acceptable to use generative AI in coursework.
    Five faculty members surveyed students across health professions graduate programs using an updated, validated survey instrument. Students anonymously completed the survey online, which took 10-20 minutes. Data were then tabulated and reported in aggregate form.
    Nearly 400 students responded, drawn from twelve academic programs including health and rehabilitation science, occupational therapy, physical therapy, physician assistant studies, speech-language pathology, health administration and health informatics, undergraduate healthcare studies, nurse anesthesiology, and cardiovascular perfusion. The majority of students recognize the threat generative AI poses to graded assignments such as tests and papers, but many believe it is acceptable to use these tools to learn and study outside of graded assignments.
    Generative AI tools provide new options for students to study and learn. Graduate students in the health professions are currently using generative AI applications but are not universally aware of, or in agreement about, how its use threatens academic integrity. Faculty should provide specific guidance on how generative AI applications may be used.

  • Article type: Letter
    No abstract available.

  • Article type: Journal Article
    Although rare diseases individually have a low prevalence, they collectively affect nearly 400 million individuals around the world. On average, it takes five years to reach an accurate rare disease diagnosis, and many patients remain undiagnosed or misdiagnosed. As machine learning technologies have been used to aid diagnostics in the past, this study aims to test ChatGPT's suitability for rare disease diagnostic support with the enhancement provided by Retrieval Augmented Generation (RAG). RareDxGPT, our enhanced ChatGPT model, supplies ChatGPT with information about 717 rare diseases from an external knowledge resource, the RareDis Corpus, through RAG. In RareDxGPT, when a query is entered, the three documents in the RareDis Corpus most relevant to the query are retrieved. Along with the query, they are passed to ChatGPT to provide a diagnosis. Additionally, phenotypes for thirty different diseases were extracted from free text in PubMed's Case Reports. They were each entered with three different prompt types: "prompt", "prompt + explanation", and "prompt + role play". The accuracy of ChatGPT and RareDxGPT with each prompt was then measured. With "prompt", RareDxGPT had 40% accuracy, while ChatGPT 3.5 got 37% of the cases correct. With "prompt + explanation", RareDxGPT had 43% accuracy, while ChatGPT 3.5 got 23% of the cases correct. With "prompt + role play", RareDxGPT had 40% accuracy, while ChatGPT 3.5 got 23% of the cases correct. To conclude, ChatGPT, especially when supplied with extra domain-specific knowledge, demonstrates early potential for rare disease diagnosis, with adjustments.
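
    A minimal sketch of the retrieve-then-prompt loop the abstract describes (top-3 documents appended to the query); the paper does not specify its retriever, so a sentence-embedding ranker is assumed here, and the corpus contents, model names, and system prompt are placeholders.

```python
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

# Placeholder stand-in for the RareDis Corpus (717 rare-disease documents).
corpus = ["Document describing disease A ...", "Document describing disease B ..."]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = embedder.encode(corpus, convert_to_tensor=True)
client = OpenAI()

def rare_dx(phenotypes: str, k: int = 3) -> str:
    """Retrieve the k most relevant corpus documents and request a diagnosis."""
    query_emb = embedder.encode(phenotypes, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=k)[0]
    context = "\n\n".join(corpus[hit["corpus_id"]] for hit in hits)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Suggest a likely rare disease diagnosis."},
            {"role": "user", "content": f"{context}\n\nPatient phenotypes: {phenotypes}"},
        ],
    )
    return response.choices[0].message.content
```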
