Keywords: Acute respiratory distress syndrome; Cardiac surgery; Machine learning; Prediction model; SHAP value

MeSH: Humans; Respiratory Distress Syndrome / etiology; Machine Learning; Male; Female; Middle Aged; Cohort Studies; Cardiac Surgical Procedures / adverse effects; Aged; ROC Curve; Area Under Curve

Source: DOI:10.1186/s12967-024-05395-1

Abstract:
BACKGROUND: Acute respiratory distress syndrome (ARDS) after cardiac surgery is a severe respiratory complication with high mortality and morbidity. Traditional clinical approaches may under-recognize this heterogeneous syndrome, potentially delaying diagnosis. This study aims to develop and externally validate seven machine learning (ML) models, trained on electronic health record data, for predicting ARDS after cardiac surgery.
METHODS: This multicenter, observational cohort study included patients who underwent cardiac surgery, divided into training and testing cohorts (data from Nanjing First Hospital) and a validation cohort (data from Shanghai General Hospital). ARDS was defined according to the Berlin definition. The number of important features was determined using the sliding windows sequential forward feature selection (SWSFS) method. We developed a set of tree-based ML models: Decision Tree, GBDT, AdaBoost, XGBoost, LightGBM, Random Forest, and Deep Forest. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and the Brier score. The SHapley Additive exPlanations (SHAP) technique was employed to interpret the ML model. The ML models were also compared with traditional scoring systems.
RESULTS: A total of 1996 patients who had cardiac surgery were included in the study. The top five important features identified by the SWSFS were chronic obstructive pulmonary disease, preoperative albumin, central venous pressure_T4, cardiopulmonary bypass time, and left ventricular ejection fraction. Among the seven ML models, Deep Forest demonstrated the best performance, with an AUC of 0.882 and a Brier score of 0.809 in the validation cohort. Notably, the SHAP values illustrated the contribution of the 13 selected features to the model output and each individual feature's effect on model prediction. In addition, the ensemble ML models outperformed six traditional scoring systems.
CONCLUSIONS: Our study identified 13 important features and provided multiple ML models to enhance the risk stratification for ARDS after cardiac surgery. Using these predictors and ML models might provide a basis for early diagnostic and preventive strategies in the perioperative management of ARDS patients.
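The two performance metrics named in the METHODS, AUC and the Brier score, can be illustrated with a minimal sketch. This is not the study's code: the function names `roc_auc` and `brier_score` and the toy labels and probabilities are hypothetical, chosen only to show how each metric is computed from predicted probabilities and binary outcomes.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored above
    a random negative case; ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = ARDS, 0 = no ARDS); not taken from the study.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print("AUC:", roc_auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
```

AUC measures discrimination (higher is better, 0.5 corresponds to chance), while the Brier score measures the accuracy of the predicted probabilities themselves (lower is better, 0 is perfect).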