BACKGROUND: Among the problems caused by hypertension, early renal damage is often overlooked. It usually cannot be diagnosed until the condition is severe and irreversible damage has occurred. We therefore set out to screen and explore risk factors for early renal damage in hypertensive patients and to establish a data-mining-based early-warning model of renal damage, with the aim of diagnosing renal damage in these patients early.
METHODS: Using an electronic information management system for hypertensive outpatients, we collected data on 513 original, untreated hypertensive patients. We recorded their demographic data, ambulatory blood pressure parameters, routine blood indices, and blood biochemical indices to establish a clinical database. We then screened risk factors for early renal damage through feature engineering and built early-warning models with Random Forest, Extra-Trees, and XGBoost, respectively. Finally, we built a new model by fusing these models with a Stacking strategy. We used cross-validation to evaluate the stability and reliability of each model and to determine the best risk assessment model.
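The modelling pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic (`make_classification`), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only one library, the logistic-regression meta-learner and the choice of eight features kept by a Random Forest importance ranking are assumptions, and all hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the clinical database: 513 rows, 20 hypothetical features.
X, y = make_classification(
    n_samples=513, n_features=20, n_informative=8, random_state=0
)


def make_stack():
    """Stacked ensemble of three tree-based base learners."""
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
        # Assumption: gradient boosting from scikit-learn replaces XGBoost here.
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ]
    # Assumption: a logistic regression combines the base learners' predictions.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Feature selection lives inside the pipeline so that it is refit on each
# training fold and never sees the held-out fold.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0),
    max_features=8,       # keep the 8 most important features (assumed count)
    threshold=-np.inf,    # rank purely by importance, no absolute cut-off
)
selected_model = Pipeline([("select", selector), ("stack", make_stack())])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
full_scores = cross_val_score(make_stack(), X, y, cv=cv, scoring="accuracy")
selected_scores = cross_val_score(selected_model, X, y, cv=cv, scoring="accuracy")

print(f"full features:     mean accuracy {full_scores.mean():.3f}")
print(f"selected features: mean accuracy {selected_scores.mean():.3f}")
```

Wrapping the selector and the stacked model in one `Pipeline` is a design choice that keeps the cross-validation estimate free of selection leakage; the numbers it prints on synthetic data say nothing about the clinical results.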
RESULTS: In descending order of importance, the features selected by feature engineering were the nocturnal drop rate of systolic blood pressure, red blood cell distribution width, blood pressure circadian rhythm, mean daytime diastolic blood pressure, body surface area, smoking, age, and HDL. The average precision of the Stacking-based two-dimensional fusion model was 0.89685 with the full feature set and 0.93824 with the selected features, an improvement of about 0.041.
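For readers unfamiliar with the two top-ranked blood pressure features, the sketch below shows how a nocturnal systolic drop rate and a circadian rhythm category are commonly derived from ambulatory monitoring averages. The abstract does not define either feature, so the formula (relative fall from the daytime mean to the night-time mean) and the 10% and 20% cut-offs are assumptions based on common clinical convention, and the function names are hypothetical.

```python
def nocturnal_sbp_drop_rate(day_mean_sbp: float, night_mean_sbp: float) -> float:
    """Percentage fall in mean systolic pressure from daytime to night-time.

    Assumed definition: (day - night) / day * 100.
    """
    if day_mean_sbp <= 0:
        raise ValueError("daytime mean systolic pressure must be positive")
    return (day_mean_sbp - night_mean_sbp) / day_mean_sbp * 100.0


def circadian_pattern(drop_rate: float) -> str:
    """Map a drop rate (in percent) to a rhythm category.

    Assumed cut-offs: below 0 reverse dipper, 0 to under 10 non-dipper,
    10 to 20 dipper, above 20 extreme dipper.
    """
    if drop_rate < 0:
        return "reverse dipper"
    if drop_rate < 10:
        return "non-dipper"
    if drop_rate <= 20:
        return "dipper"
    return "extreme dipper"


# Example: daytime mean 140 mmHg, night-time mean 119 mmHg.
rate = nocturnal_sbp_drop_rate(140.0, 119.0)
print(f"{rate:.1f}% -> {circadian_pattern(rate)}")
```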
CONCLUSIONS: Through feature engineering and risk factor analysis, we selected the nocturnal drop rate of systolic blood pressure, red blood cell distribution width, blood pressure circadian rhythm, and mean daytime diastolic blood pressure as early-warning factors for early renal damage in patients with hypertension. On this basis, the two-dimensional fusion model based on the Stacking strategy performed better than any single model and can be used for risk assessment of early renal damage in hypertensive patients.