Prediction of risk of in-hospital death in patients with chronic heart failure complicated by lung infections using interpretable machine learning

Abstract:

OBJECTIVE: To predict the risk of in-hospital death in patients with chronic heart failure (CHF) complicated by lung infections using interpretable machine learning.
METHODS: Clinical data of 1415 patients diagnosed with CHF complicated by lung infections were obtained from the MIMIC-IV database. The patients were divided by pathogen type into a bacterial pneumonia group and a non-bacterial pneumonia group, and the risk of in-hospital death in the two groups was compared using Kaplan-Meier survival curves. Univariate analysis and LASSO regression were used to select features for constructing logistic regression (LR), AdaBoost, XGBoost, and LightGBM models, whose performance was compared in terms of accuracy, precision, F1 score, and AUC. The models were externally validated using data from the eICU-CRD database. The SHAP algorithm was applied to interpret the XGBoost model.
RESULTS: Among the 4 models, XGBoost showed the highest accuracy and F1 score in the training set for predicting the risk of in-hospital death in CHF patients with lung infections. In the external test set, the XGBoost model had an AUC of 0.691 (95% CI: 0.654-0.720) in the bacterial pneumonia group and an AUC of 0.725 (95% CI: 0.577-0.782) in the non-bacterial pneumonia group, and showed better predictive ability and stability than the other models.
CONCLUSIONS: The overall performance of the XGBoost model is superior to the other 3 models for predicting the risk of in-hospital death in CHF patients with lung infections. The SHAP algorithm provides a clear interpretation of the model to facilitate decision-making in clinical settings.
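
The comparison metrics named in the abstract (accuracy, precision, F1 score, and AUC with a 95% CI) can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract does not say how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical.

```python
import random


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and F1 score for binary labels (1 = in-hospital death)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "f1": f1}


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap CI for the AUC (assumed method)."""
    point = auc(y_true, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return point, stats[lo_idx], stats[hi_idx]


# Hypothetical toy data: true outcomes and predicted death probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.6, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
point, lower, upper = bootstrap_auc_ci(y_true, scores, n_boot=500)
print(metrics)
print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas the AUC is computed from the ranking of the predicted probabilities alone, which is one reason it is commonly reported separately for each subgroup in an external test set.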
