Accurately identifying clinical phenotypes from electronic health records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments, and progression using GPT-4, and to compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, and 2 rule-based and machine learning-based methods, namely, scispaCy and medspaCy.
Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13,646 clinical notes for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model was evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, Llama-3-8B, medspaCy, and scispaCy by comparing precision, recall, and micro-F1 scores.
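The comparison metrics named above can be illustrated with a minimal sketch of micro-averaged precision, recall, and F1. This is not the study's evaluation code: the function name `micro_prf`, the per-patient label sets, and the label strings are all hypothetical, chosen only to show how true positives, false positives, and false negatives are pooled across patients before the scores are computed.

```python
def micro_prf(gold, pred):
    """Micro-averaged precision, recall, and F1.

    gold, pred: lists of label sets, one set per patient (hypothetical layout).
    Counts are pooled over all patients before the ratios are taken.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels both annotated and predicted
        fp += len(p - g)   # predicted but not annotated
        fn += len(g - p)   # annotated but not predicted
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example labels, not data from the study.
gold = [
    {"stage:IIIA", "treatment:chemotherapy"},
    {"stage:IB"},
    {"recurrence:yes", "organ:brain"},
]
pred = [
    {"stage:IIIA", "treatment:radiation"},
    {"stage:IB"},
    {"recurrence:yes"},
]

precision, recall, f1 = micro_prf(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} micro-F1={f1:.3f}")
```

Because counts are pooled before dividing, patients with many phenotype labels weigh more heavily than patients with few, which is the usual distinction between micro- and macro-averaging.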
GPT-4 achieved higher F1 score, precision, and recall than Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, medspaCy, and scispaCy. GPT-3.5-turbo performed similarly to GPT-4. The GPT, Flan-T5, and Llama models were not constrained by explicit rule requirements for contextual pattern recognition. The spaCy-based models relied on predefined patterns, which led to their suboptimal performance.
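To make the limitation of predefined patterns concrete, the following is a small sketch of a rule-based stage extractor built on a single regular expression. It does not reproduce the rules used by medspaCy or scispaCy in the study; the pattern `STAGE_PATTERN`, the function `extract_stage`, and the example notes are hypothetical. It only shows that a fixed pattern finds the phrasing it was written for and returns nothing for an equivalent paraphrase.

```python
import re

# Hypothetical rule: the word "stage" followed by a Roman-numeral stage.
# Longer alternatives come first so "IIIA" is not cut short to "I".
STAGE_PATTERN = re.compile(
    r"\bstage\s+(IV|III[ABC]?|II[AB]?|I[AB]?)\b",
    re.IGNORECASE,
)


def extract_stage(note):
    """Return the first Roman-numeral stage mentioned in the note, or None."""
    match = STAGE_PATTERN.search(note)
    return match.group(1).upper() if match else None


# Made-up example notes, not text from the study's records.
matched = extract_stage("Diagnosed with stage IIIA NSCLC in the right upper lobe.")
missed = extract_stage("Locally advanced disease, clinical staging 3A.")

print(matched)  # the pattern covers this phrasing
print(missed)   # the paraphrase falls outside the predefined pattern
```

Covering the second note would require another hand-written rule, whereas the generative models compared here are prompted with the note text directly and do not depend on such rules.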
GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text and robust clinical phenotype extraction.