Background: Diagnosis codes and prescription data are used in algorithms to identify postherpetic neuralgia (PHN), a debilitating complication of herpes zoster (HZ). Because the accuracy of codes and prescription data is questionable, manual chart review, which can be costly and time-consuming, is sometimes used to identify PHN in electronic health records (EHRs).
Objective: This study aims to develop and validate a natural language processing (NLP) algorithm for automatically identifying PHN from unstructured EHR data and to compare its performance with that of code-based methods.
Methods: This retrospective study used EHR data from Kaiser Permanente Southern California, a large integrated health care system that serves over 4.8 million members. The source population included members aged ≥50 years who received an incident HZ diagnosis and accompanying antiviral prescription between 2018 and 2020 and had ≥1 encounter within 90-180 days of the incident HZ diagnosis. The study team manually reviewed the EHR and identified PHN cases. For NLP development and validation, random samples of 500 and 800 members, respectively, were selected from the source population. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F-score, and Matthews correlation coefficient (MCC) of the NLP algorithm and the code-based methods were evaluated using chart-reviewed results as the reference standard.
Results: The NLP algorithm identified PHN cases with 90.9% sensitivity, 98.5% specificity, 82% PPV, and 99.3% NPV. The composite scores of the NLP algorithm were 0.89 (F-score) and 0.85 (MCC). The prevalence of PHN in the validation data was 6.9% by the reference standard, 7.6% by NLP, and 5.4%-13.1% by the code-based methods. The code-based methods achieved 52.7%-61.8% sensitivity, 89.8%-98.4% specificity, 27.6%-72.1% PPV, and 96.3%-97.1% NPV. Their F-scores ranged from 0.45 to 0.59 and their MCCs from 0.32 to 0.61.
Conclusions: The automated NLP-based approach identified PHN cases from the EHR with good accuracy. This method could be useful in population-based PHN research.
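The eligibility criteria and sampling step described above can be illustrated with a minimal sketch. It is not the study's code: the `Member` record, its field names, and the `is_eligible` and `draw_samples` helpers are hypothetical. It also assumes that the 90-180 day window and the 2018-2020 range are inclusive, that the antiviral prescription is already summarized as a flag, and that the development and validation samples do not overlap, none of which the abstract states.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple
import random


@dataclass
class Member:
    # Hypothetical record layout, for illustration only.
    member_id: str
    age_at_hz: int               # age at the incident HZ diagnosis
    hz_date: date                # date of the incident HZ diagnosis
    antiviral: bool              # accompanying antiviral prescription present
    encounter_dates: List[date]  # dates of all recorded encounters


def is_eligible(m: Member) -> bool:
    """Apply the source-population criteria (inclusive bounds are assumed)."""
    if m.age_at_hz < 50 or not m.antiviral:
        return False
    if not (2018 <= m.hz_date.year <= 2020):
        return False
    # At least one encounter 90-180 days after the incident HZ diagnosis.
    return any(90 <= (d - m.hz_date).days <= 180 for d in m.encounter_dates)


def draw_samples(members: List[Member], n_dev: int, n_val: int,
                 seed: int = 0) -> Tuple[List[Member], List[Member]]:
    """Draw disjoint development and validation samples (disjointness assumed)."""
    eligible = [m for m in members if is_eligible(m)]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    picked = rng.sample(eligible, n_dev + n_val)
    return picked[:n_dev], picked[n_dev:]
```

Under these assumptions, `draw_samples(members, 500, 800)` would correspond to the sample sizes reported for development and validation.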
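To show how these measures relate to one another, the sketch below computes them from a 2x2 confusion matrix. The abstract does not report the confusion matrix, so the counts used here (50 true positives, 11 false positives, 5 false negatives, 734 true negatives, 800 records in total) are illustrative values chosen only because they are consistent with the reported percentages. The function name `diagnostic_metrics` is hypothetical, and because the abstract does not say which F-score weighting was used, `beta` is left as a parameter.

```python
import math


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int, beta: float = 1.0) -> dict:
    """Standard accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall among reference-standard cases
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # precision
    npv = tn / (tn + fn)
    b2 = beta ** 2
    # F-beta: weighted harmonic mean of PPV and sensitivity.
    f_beta = (1 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_beta": f_beta,
        "mcc": mcc,
    }


# Illustrative counts only (assumed, not taken from the paper).
nlp = diagnostic_metrics(tp=50, fp=11, fn=5, tn=734)
print({name: round(value, 3) for name, value in nlp.items()})
```

With these illustrative counts, the sensitivity, specificity, PPV, NPV, and MCC round to the reported 90.9%, 98.5%, 82%, 99.3%, and 0.85. The F-score comes out at about 0.86 with `beta=1` and about 0.89 with `beta=2`; which weighting the authors used is not stated here.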