MeSH: Adult; Female; Humans; Male; Alcohol Drinking; Area Under Curve; Asian People; Calibration; Hyperuricemia / epidemiology, etiology; Risk Factors; Risk Assessment; Clinical Decision Rules; Models, Statistical; China / epidemiology

Source: DOI:10.1039/d3fo01363d

Abstract:
This study aimed to establish a simple, non-invasive risk prediction model for hyperuricemia (HUA) in Chinese adults based on modifiable risk factors. The baseline survey of the Beijing Health Management Cohort (BHMC) was conducted in 2020-2021 among the health examination population of Beijing. Data were collected on a range of lifestyle risk factors, including dietary patterns and habits, cigarette smoking, alcohol intake, sleep duration, and cell-phone use. We developed HUA prediction models using three machine-learning techniques: logistic regression (LR), random forest (RF), and XGBoost. The three methods were compared on discrimination, calibration, and clinical applicability, and decision curve analysis (DCA) was used to assess each model's clinical usefulness. A total of 74 050 people were included in the study, of whom 55 537 (75%) were randomly assigned to the training set and the other 18 513 (25%) to the validation set. The prevalence of HUA was 38.43% in men and 13.29% in women. The XGBoost model performed better than the LR and RF models. In the training set, the areas under the curve (AUC) (95% CI) for the LR, RF, and XGBoost models were 0.754 (0.750-0.757), 0.844 (0.841-0.846), and 0.854 (0.851-0.856), respectively. The XGBoost model had a higher classification accuracy (0.774) than the LR (0.592) and RF (0.767) models. In the validation set, the AUC (95% CI) values for the LR, RF, and XGBoost models were 0.758 (0.749-0.765), 0.809 (0.802-0.816), and 0.820 (0.813-0.827), respectively. The DCA curves showed that all three models could bring net benefit within the appropriate range of threshold probabilities. Overall, XGBoost had better discrimination and accuracy. The modifiable risk factors included in the model may help identify people at high risk of HUA and guide lifestyle interventions for them.
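The abstract reports DCA results without showing how net benefit is computed. As a minimal sketch of the standard net-benefit formula used in decision curve analysis, not the authors' implementation, the quantity at a threshold probability p_t is TP/n - (FP/n) * p_t / (1 - p_t). The function names and the toy labels and probabilities below are hypothetical and are not data from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on everyone whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged as high risk and actually a case.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged as high risk but not a case.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: intervene on everyone regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (illustration only, not from the BHMC cohort).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]

for pt in (0.25, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots these values over a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).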