Keywords: automated machine learning (AutoML); artificial intelligence; controlled attenuation parameter; nonalcoholic fatty liver disease (NAFLD); Shiny application

Source: DOI:10.1177/20552076241272535

Abstract:
Nonalcoholic fatty liver disease (NAFLD) is recognized as one of the most common chronic liver diseases worldwide. This study aims to assess the efficacy of automated machine learning (AutoML) in identifying NAFLD using a population-based cross-sectional database.
All data, including laboratory examinations, anthropometric measurements, and demographic variables, were obtained from the National Health and Nutrition Examination Survey (NHANES). NAFLD was defined by the controlled attenuation parameter (CAP) measured in liver transient ultrasound elastography. Least absolute shrinkage and selection operator (LASSO) regression was employed for feature selection. Six algorithms were run on the H2O automated machine learning platform: Gradient Boosting Machine (GBM), Distributed Random Forest (DRF), Extremely Randomized Trees (XRT), Generalized Linear Model (GLM), eXtreme Gradient Boosting (XGBoost), and Deep Learning (DL). These algorithms were selected for their diverse strengths, including the ability to handle complex, non-linear relationships, high predictive accuracy, and interpretability. The models were evaluated by the area under the receiver operating characteristic curve (AUC) and interpreted using the calibration curve, decision curve analysis, variable importance plot, SHapley Additive exPlanation (SHAP) plot, partial dependence plots, and local interpretable model-agnostic explanation (LIME) plot.
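To make the AUC used for model comparison concrete, the following minimal Python sketch computes it as the probability that a randomly chosen NAFLD case receives a higher predicted score than a randomly chosen non-NAFLD participant, with ties counted as one half. The function name `auc_from_scores` and the toy labels and scores are illustrative assumptions; they are not code or data from the study, which used the H2O platform.

```python
from typing import Sequence


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the Mann-Whitney probability that a positive
    example is scored above a negative example (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    # Compare every positive score with every negative score.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with hypothetical values: 1 = NAFLD, 0 = non-NAFLD.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of participants, which is fine for illustration; a rank-based implementation would be preferable for a cohort of several thousand.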
A total of 4177 participants (non-NAFLD 3167 vs NAFLD 1010) were included to develop and validate the AutoML models. The XGBoost model outperformed the other AutoML models, achieving an AUC of 0.859, an accuracy of 0.795, a sensitivity of 0.773, and a specificity of 0.802 on the validation set.
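Accuracy, sensitivity, and specificity are all derived from the confusion matrix of the validation set. The abstract does not report those counts, so the numbers in the sketch below are hypothetical and chosen only to illustrate the definitions; `Metrics` and `classification_metrics` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Metrics:
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    positives = tp + fn  # participants who truly have the condition
    negatives = tn + fp  # participants who truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return Metrics(
        accuracy=(tp + tn) / (positives + negatives),
        sensitivity=tp / positives,  # true positive rate
        specificity=tn / negatives,  # true negative rate
    )


# Hypothetical counts, not taken from the study.
example = classification_metrics(tp=80, fn=20, tn=240, fp=60)
print(example)
```

Accuracy is the prevalence-weighted average of sensitivity and specificity. Assuming the validation set has roughly the same class ratio as the full cohort (1010 of 4177, about 24% NAFLD), the reported values are mutually consistent: 0.242 × 0.773 + 0.758 × 0.802 ≈ 0.795.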
We developed an XGBoost model to better identify the presence of NAFLD. Based on this model, we created an R Shiny web-based application named Shiny NAFLD (http://39.101.122.171:3838/App2/). This application demonstrates the potential of AutoML in clinical research and practice, offering a promising tool for the real-world identification of NAFLD.