Keywords: OMOP; EHR; adverse drug events; explainable AI; graph neural network

Source: DOI:10.1093/jamia/ocae155

Abstract:
OBJECTIVE: The aim of this project was to create time-aware, individual-level risk score models for adverse drug events related to multiple sclerosis disease-modifying therapy and to provide interpretable explanations for model prediction behavior.
METHODS: We used temporal sequences of Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) concepts derived from an electronic health record as model features. Each concept was assigned an embedding representation that was learned from a graph convolution network trained on a knowledge graph (KG) of OMOP concept relationships. Concept embeddings were fed into long short-term memory networks for 1-year adverse event prediction following drug exposure. Finally, we implemented a novel extension of the local interpretable model-agnostic explanations (LIME) method, knowledge graph LIME (KG-LIME), to leverage the KG and explain individual predictions of each model.
RESULTS: For a set of 4859 patients, we found that our model was effective at predicting 32 out of 56 adverse event types (P < .05) when compared to demographics and past diagnosis as variables. We also assessed discrimination in the form of area under the curve (AUC = 0.77 ± 0.15) and area under the precision-recall curve (AUC-PR = 0.31 ± 0.27) and assessed calibration in the form of Brier score (BS = 0.04 ± 0.04). Additionally, KG-LIME generated interpretable literature-validated lists of relevant medical concepts used for prediction.
CONCLUSIONS: Many of our risk models demonstrated high calibration and discrimination for adverse event prediction. Furthermore, our novel KG-LIME method was able to utilize the knowledge graph to highlight concepts that were important to prediction. Future work will be required to further explore the temporal window of adverse event occurrence beyond the generic 1-year window used here, particularly for short-term inpatient adverse events and long-term severe adverse events.
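For readers less familiar with the three metrics reported in the results (AUC, AUC-PR, and Brier score), the following is a minimal pure-Python sketch of how each can be computed from binary outcomes and predicted probabilities. It is an illustration only, not the authors' evaluation code: the function names and the toy data are hypothetical, AUC is assumed to mean the area under the ROC curve, AUC-PR is approximated by step-wise average precision, and tied scores are not specially handled in the precision-recall calculation.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve; needs at least one positive."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = adverse event within the prediction window.
    y_true = [0, 0, 1, 0, 1, 0, 0, 1]
    y_prob = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.30, 0.70]
    print("AUC    :", round(roc_auc(y_true, y_prob), 3))
    print("AUC-PR :", round(average_precision(y_true, y_prob), 3))
    print("Brier  :", round(brier_score(y_true, y_prob), 3))
```

Because adverse events are typically rare, AUC-PR and the Brier score depend strongly on event prevalence, so they are best read alongside the event rate of each outcome rather than compared across outcomes directly.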