Keywords: Shapley additive explanations; gradient boosting machine; machine learning; ocular metastases in gastric adenocarcinoma; risk factor

MeSH: Humans; Reproducibility of Results; Retrospective Studies; Adenocarcinoma; Algorithms; Eye Neoplasms; Machine Learning; Stomach Neoplasms

Source: DOI:10.1177/15330338231219352

Abstract:
Background: Although gastric adenocarcinoma (GA)-related ocular metastasis (OM) is rare, its occurrence indicates more severe disease. We aimed to use machine learning (ML) to analyze the risk factors for GA-related OM and to predict its risk.
Methods: This is a retrospective cohort study. The clinical data of 3532 GA patients were collected and randomly split into training and validation sets at a ratio of 7:3. Patients were grouped by the presence or absence of OM into OM and non-OM (NOM) groups. Univariate and multivariate logistic regression analyses and least absolute shrinkage and selection operator (LASSO) regression were conducted. We combined the variables identified through feature importance ranking and refined the selection further with forward sequential feature selection based on the random forest (RF) algorithm before entering them into the ML model. We applied six ML algorithms to construct the predictive model for GA-related OM. The area under the receiver operating characteristic (ROC) curve was used to assess each model's predictive ability. We also built a web-based risk calculator on the best-performing model. We used Shapley additive explanations (SHAP) to identify risk factors and to support the interpretability of the black-box model. All patient details were de-identified.
Results: With 13 variables, the gradient boosting machine (GBM) achieved the best predictive performance, with an area under the curve (AUC) of 0.997 in the test set. Using SHAP, we identified key factors for OM in GA patients, including LDL, CA724, CEA, AFP, CA125, Hb, CA153, and Ca2+. We also checked the model's reliability by analyzing two patient cases and developed a working online prediction calculator based on the GBM model.
Conclusion: We used ML to establish a risk prediction model for GA-related OM and showed that GBM performed best among the six ML models. The model may help identify patients with GA-related OM so that they can receive early and timely treatment.
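The evaluation steps named in the Methods (a 7:3 random split, the area under the ROC curve, and Shapley-value attribution) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the function names, the toy scoring function `toy_model`, and its three unnamed features are hypothetical, and the exact Shapley computation shown here enumerates every feature ordering, which is only practical for a handful of features (the SHAP library relies on much faster model-specific algorithms).

```python
import random
from itertools import permutations


def split_70_30(rows, seed=0):
    """Shuffle rows and split them 7:3 into training and validation lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Each feature's value is its marginal contribution to the model output,
    averaged over all orderings in which features are switched from the
    baseline value to the sample value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            new = model(current)
            phi[i] += new - prev
            prev = new
    return [v / len(orderings) for v in phi]


def toy_model(features):
    """Hypothetical risk score: one additive term plus one interaction."""
    return 2.0 * features[0] + features[1] * features[2]


if __name__ == "__main__":
    train, valid = split_70_30(range(3532))
    print(len(train), len(valid))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

In the toy example the attributions sum to the difference between the model output for the sample and for the baseline, which is the additivity property that lets per-feature SHAP values be read as contributions to a single prediction.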