Moving Biosurveillance Beyond Coded Data Using AI for Symptom Detection From Physician Notes: Retrospective Cohort Study

Abstract:

BACKGROUND: Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records.
OBJECTIVE: This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically study patients presenting to the emergency department (ED) who can be sentinel cases in an outbreak.
METHODS: Subjects in this retrospective cohort study are patients who are 21 years of age and younger, who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect the mention of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes and had strong agreement (F1-score=0.986; positive predictive value [PPV]=0.972; and sensitivity=1.0). F1-score, PPV, and sensitivity were used to compare the performance of both NLP and the International Classification of Diseases, 10th Revision (ICD-10) coding to the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras.
RESULTS: There were 85,678 ED encounters during the study period, including 4% (n=3420) with patients with COVID-19. NLP was more accurate at identifying encounters with patients who had any of the COVID-19 symptoms (F1-score=0.796) than ICD-10 codes (F1-score=0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300). However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the largest accuracy difference (NLP: F1-score=0.828 and ICD-10: F1-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras.
CONCLUSIONS: This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.
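
The evaluation measures named above (PPV, sensitivity, specificity, F1-score, and odds ratio) follow standard definitions, and a minimal Python sketch of how they could be computed is given here for orientation. It is not the study's pipeline: the function names, the `Confusion` class, and all input values are hypothetical, and the toy labels are made up rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from comparing a detector (NLP or ICD-10) with chart review."""
    tp: int = 0  # symptom present in gold standard and detected
    fp: int = 0  # symptom absent in gold standard but detected
    fn: int = 0  # symptom present in gold standard but missed
    tn: int = 0  # symptom absent in gold standard and not detected


def tally(gold, predicted):
    """Build confusion counts from two equal-length lists of booleans."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    c = Confusion()
    for g, p in zip(gold, predicted):
        if g and p:
            c.tp += 1
        elif p:
            c.fp += 1
        elif g:
            c.fn += 1
        else:
            c.tn += 1
    return c


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c):
    """Standard definitions of the measures reported in the abstract."""
    return {
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


def odds_ratio(pos_with, pos_without, neg_with, neg_without):
    """Odds of a detected symptom in COVID-19 positive vs negative encounters."""
    return _ratio(pos_with * neg_without, pos_without * neg_with)


if __name__ == "__main__":
    # Made-up labels for one symptom across eight notes (illustration only).
    gold = [True, True, True, False, False, False, True, False]
    nlp = [True, True, False, True, False, False, True, False]
    print(metrics(tally(gold, nlp)))
    # Made-up 2x2 counts: symptom detected or not, by COVID-19 status.
    print(odds_ratio(30, 70, 10, 90))
```

In a per-symptom comparison such as the one described, `tally` would be run once per symptom and per method (NLP and ICD-10) against the same gold-standard labels, so that the resulting scores are directly comparable.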
