Keywords: CatBoost; Diabetic retinopathy; Machine learning; Prediction model; SHAP

Source: DOI:10.1016/j.heliyon.2024.e29497

Abstract:
Diabetic retinopathy is one of the major complications of diabetes. In this study, a diabetic retinopathy risk prediction model combining machine learning with SHAP was established to improve the accuracy of risk prediction, explain the rationale behind the model's predictions, and improve the reliability of the results.
Data were preprocessed to handle missing values and outliers, features were selected by information gain, a diabetic retinopathy risk prediction model was built with CatBoost, and the model's outputs were interpreted with SHAP.
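The information gain step can be illustrated with a minimal sketch. It assumes features have already been discretized (continuous laboratory values would need binning first), and the feature names and toy data are hypothetical, not taken from the study's dataset. Information gain is computed here as the entropy of the label minus the entropy remaining after splitting on a feature; the abstract does not state the exact variant or threshold used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy left after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 1 = retinopathy, 0 = no retinopathy.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
features = {
    "hba1c_high": [1, 1, 1, 0, 0, 0, 1, 0],  # separates the two classes perfectly
    "tall": [1, 0, 1, 0, 1, 0, 0, 1],        # carries no information about the label
}

# Rank features from most to least informative.
ranked = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

Features whose gain falls below a chosen cutoff would be dropped before model training; the cutoff itself is a design choice that the abstract does not report.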
One thousand records from the diabetes complication early warning dataset of the National Clinical Medical Sciences Data Center were used in this study. The CatBoost-based model performed best among the models compared. ALB_CR, HbA1c, UPR_24, NEPHROPATHY and SCR were positively correlated with diabetic retinopathy, while CP, HB, ALB, DBILI and CRP were negatively correlated with it. The relationships of HEIGHT, WEIGHT and ESR with diabetic retinopathy were not significant.
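To show how a SHAP-style attribution can indicate the direction of a feature's effect, the sketch below computes exact Shapley values by brute force for a hypothetical three-feature linear risk score. The score, its weights, the baseline and the patient values are all illustrative assumptions and not the paper's CatBoost model. In practice the SHAP library uses much faster model-specific algorithms for tree ensembles.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical linear risk score (illustrative weights only)."""
    return 0.3 * x["hba1c"] - 0.2 * x["alb"] + 0.0 * x["height"]


baseline = {"hba1c": 6.0, "alb": 40.0, "height": 170.0}  # assumed reference patient
patient = {"hba1c": 9.0, "alb": 45.0, "height": 180.0}   # assumed patient of interest

phi = shapley_values(toy_risk, patient, baseline)
```

In this toy case the elevated HbA1c receives a positive attribution, the elevated albumin a negative one, and height an attribution of zero, which mirrors the kinds of positive, negative and non-significant relationships reported above. The attributions sum to the difference between the patient's score and the baseline score.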
Risk factors for diabetic retinopathy include poor renal function, elevated blood glucose, liver disease, hematonosis (blood disease) and dysarteriotony (abnormal blood pressure), among others. Diabetic retinopathy can be prevented by monitoring and effectively controlling the relevant indices. The interactions between features were also analyzed to explore further potential factors in diabetic retinopathy, which may offer new methods and ideas for its early prevention and clinical diagnosis.