Keywords: prognosis; SHAP; balanced random forest; coronary heart disease comorbid with hypertension; ensemble learning

Source: DOI:10.2147/RMHP.S472398

Abstract:
Purpose: This study sought to develop an ensemble model for imbalanced data that accurately predicts death in patients with comorbid coronary heart disease (CHD) and hypertension, and to evaluate the factors contributing to death.
Methods: Medical records of 1058 patients with CHD combined with hypertension, excluding those with acute coronary syndrome, were collected. Patients were followed up at the first, third, sixth, and twelfth months after discharge to record death events; follow-up ended two years after discharge. Patients were divided into survival and nonsurvival groups. Gender, smoking, drinking, chronic obstructive pulmonary disease (COPD), cerebral stroke, diabetes, hyperhomocysteinemia, heart failure, renal insufficiency, and other influencing factors were extracted from the medical records and compared between the two groups, and feature selection was carried out before model construction. Because the outcome classes were imbalanced, we developed four imbalanced-ensemble prediction models (Balanced Random Forest (BRF), EasyEnsemble, RUSBoost, and SMOTEBoost) and two base classifiers (AdaBoost and logistic regression). The hyperparameters of each model were tuned with GridSearchCV, and the models were evaluated using area under the curve (AUC), sensitivity, recall, Brier score, and geometric mean (G-mean). Additionally, to understand the influence of individual variables on model predictions, we computed SHapley Additive exPlanations (SHAP) for the optimal model.
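The core idea behind a balanced random forest can be illustrated with a small sketch. In the usual description of the method, each tree is trained on a bootstrap sample that contains the same number of rows from every class, so the minority class (here, deaths) is not swamped by the majority. The snippet below shows only that sampling step, in plain Python; the function name `balanced_bootstrap` and the toy cohort are illustrative assumptions and are not taken from the study, which does not describe its implementation at this level of detail.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one class-balanced bootstrap sample.

    Draws, with replacement, as many rows from every class as the
    smallest class contains, so the sample has a 1:1 class ratio.
    """
    # Group row indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    # Size of the smallest (minority) class.
    n_minority = min(len(rows) for rows in by_class.values())

    # Sample the same number of rows from each class, with replacement.
    sample = []
    for rows in by_class.values():
        sample.extend(rng.choices(rows, k=n_minority))
    rng.shuffle(sample)
    return sample


# Hypothetical toy cohort (not the study data): 1 = died, 0 = survived.
labels = [1] * 10 + [0] * 90
rng = random.Random(0)

idx = balanced_bootstrap(labels, rng)
counts = Counter(labels[i] for i in idx)
print(counts)
```

In a full forest, one such sample would be drawn per tree and the trees' votes averaged; a library implementation would normally be used for that part rather than hand-written code.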
Results: Age, heart rate, COPD, cerebral stroke, heart failure, and renal insufficiency differed significantly between the nonsurvival and survival groups. Among all models, BRF yielded the highest AUC (0.810; 95% CI, 0.778-0.839), sensitivity (0.990; 95% CI, 0.981-1.000), recall (0.990; 95% CI, 0.981-1.000), and G-mean (0.806; 95% CI, 0.778-0.827), and the lowest Brier score (0.181; 95% CI, 0.178-0.185). We therefore identified BRF as the optimal model. Furthermore, red blood cell count (RBC), body mass index (BMI), and lactate dehydrogenase were identified as important mortality-associated risk factors.
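For readers less familiar with the reported metrics, the following sketch computes them from first principles on a tiny made-up example. Sensitivity and recall are two names for the same quantity (true positives divided by all actual positives), which is why their reported values coincide. G-mean is taken here as the square root of sensitivity times specificity, and the Brier score as the mean squared difference between predicted probability and outcome. The 0.5 decision threshold, the function names, and the toy data are assumptions for illustration; the abstract does not state the threshold or how the confidence intervals were obtained.

```python
import math


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FN, TN, FP after thresholding predicted probabilities."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died) and predicted death probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]

tp, fn, tn, fp = confusion_counts(y_true, y_prob)
sens = tp / (tp + fn)          # sensitivity, identical to recall
spec = tn / (tn + fp)          # specificity
gm = math.sqrt(sens * spec)    # geometric mean of the two
bs = brier_score(y_true, y_prob)
auc_value = auc(y_true, y_prob)

print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"G-mean={gm:.3f} Brier={bs:.3f} AUC={auc_value:.3f}")
```

Because G-mean multiplies sensitivity by specificity, a model cannot score well on it by predicting only one class, which is why it is often preferred over plain accuracy on imbalanced data.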
Conclusion: BRF, interpreted with SHAP, was highly effective and accurately predicted mortality in patients with CHD comorbid with hypertension. The model has the potential to assist clinicians in modifying treatment strategies to improve patient outcomes.