Keywords: Covariate shift; Covid-19; Machine learning; Model degradation

MeSH: Humans; COVID-19; Pandemics; Algorithms; Neural Networks, Computer; Machine Learning

Source: DOI:10.1186/s12911-023-02151-1

Abstract:
Machine-learning models are susceptible to external influences that can degrade their performance. The aim of our study was to elucidate the impact on model performance of a sudden shift in covariates, such as the one caused by the Covid-19 pandemic.
After ethical approval and registration at ClinicalTrials.gov (NCT04092933, initial release 17/09/2019), we developed different models for the prediction of perioperative mortality based on preoperative data: one using pre-pandemic data until March 2020, one using data from before the pandemic and from the first wave until May 2020, and one covering the complete period before and during the pandemic until October 2021. We applied XGBoost as well as a deep learning neural network (DL). Performance metrics of each model were determined for the different pandemic phases, and the XGBoost models were analysed for changes in feature importance.
XGBoost and DL provided similar performance on the pre-pandemic data with respect to area under the receiver operating characteristic curve (AUROC, 0.951 vs. 0.942) and area under the precision-recall curve (AUPR, 0.144 vs. 0.187). Validation in patient cohorts of the different pandemic waves showed high fluctuations in both AUROC and AUPR for DL, whereas the XGBoost models seemed more stable. With the onset of the pandemic, changes in variable frequencies were visible in age, ASA score, and a higher proportion of emergency operations, among others. Age consistently showed the highest information gain. Models based on pre-pandemic data performed worse during the first pandemic wave (AUROC 0.914 for XGBoost and DL), whereas models augmented with data from the first wave performed worse after the first wave (AUROC 0.907 for XGBoost and 0.747 for DL). The deterioration was also visible in AUPR, which worsened by over 50% for both XGBoost and DL in the first phase after re-training.
A sudden shift in data impacts model performance. Re-training the model with updated data may degrade predictive accuracy if the changes are only transient. Premature re-training should therefore be avoided, and close model surveillance is necessary.
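The per-phase surveillance described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the phase labels, the toy labels and scores, the `max_drop` threshold, and the helper names `per_phase_metrics` and `flag_degradation` are all assumptions made for illustration. AUROC is computed as the Mann-Whitney pairwise statistic, and AUPR is approximated by average precision, assuming no tied scores.

```python
from collections import defaultdict


def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is missing
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise AUPR estimate: mean precision at the rank of each positive (assumes no tied scores)."""
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else float("nan")


def per_phase_metrics(records):
    """Group (phase, label, score) records by phase and return {phase: (AUROC, AUPR)}."""
    grouped = defaultdict(lambda: ([], []))
    for phase, label, score in records:
        grouped[phase][0].append(label)
        grouped[phase][1].append(score)
    return {
        phase: (auroc(labels, scores), average_precision(labels, scores))
        for phase, (labels, scores) in grouped.items()
    }


def flag_degradation(metrics, baseline_phase, max_drop=0.05):
    """List phases whose AUROC fell more than max_drop below the baseline phase."""
    baseline = metrics[baseline_phase][0]
    return [
        phase
        for phase, (phase_auroc, _) in metrics.items()
        if phase != baseline_phase and baseline - phase_auroc > max_drop
    ]


# Hypothetical toy data: (phase, observed outcome, predicted risk).
records = [
    ("pre_pandemic", 0, 0.10), ("pre_pandemic", 0, 0.40),
    ("pre_pandemic", 1, 0.35), ("pre_pandemic", 1, 0.80),
    ("first_wave", 0, 0.60), ("first_wave", 1, 0.30),
    ("first_wave", 0, 0.50), ("first_wave", 1, 0.90),
]

metrics = per_phase_metrics(records)
flagged = flag_degradation(metrics, baseline_phase="pre_pandemic")
for phase, (phase_auroc, phase_aupr) in metrics.items():
    print(f"{phase}: AUROC={phase_auroc:.3f} AUPR={phase_aupr:.3f}")
print("degraded phases:", flagged)
```

Tracking both metrics per phase matters for rare outcomes such as perioperative mortality, because AUPR can deteriorate sharply while AUROC changes only slightly. A flagged phase here is a prompt for review rather than an automatic trigger for re-training, in line with the abstract's caution about transient shifts.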