METHODS: This study used machine learning (ML) algorithms to build prediction models from ICC data on 1,751 patients in the SEER (Surveillance, Epidemiology, and End Results) database and 58 hospital cases. The models' performance was compared using receiver operating characteristic (ROC) curve analysis, the C-index, and Brier scores.
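For readers unfamiliar with the two survival-specific metrics named above, the following is a minimal sketch of how they can be computed for right-censored data. It is an illustration only, not the authors' code: the function names `harrell_c_index` and `brier_score_at` are hypothetical, tied survival times are skipped, and the Brier score simply drops patients censored before the horizon instead of applying the inverse-probability-of-censoring weights that a full analysis would normally use.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; a higher score means a worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here (a simplification of this sketch).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score_at(horizon, times, events, surv_probs):
    """Simplified Brier score at a fixed time horizon.

    surv_probs -- predicted probability of surviving beyond `horizon`.
    Patients censored before the horizon are excluded (assumption of this
    sketch; no censoring weights are applied).
    """
    squared_errors = []
    for t, e, s in zip(times, events, surv_probs):
        if t <= horizon and not e:
            continue  # status at the horizon is unknown
        alive = 1.0 if t > horizon else 0.0
        squared_errors.append((alive - s) ** 2)
    if not squared_errors:
        raise ValueError("no patients with known status at the horizon")
    return sum(squared_errors) / len(squared_errors)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, whereas a lower Brier score indicates better-calibrated survival probabilities.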
RESULTS: A total of eight variables were used to construct the ML models. Our analysis identified the random survival forest model as the best-performing model for prognostic prediction. In the training cohort, its C-index, Brier score, and area under the curve (AUC) values were 0.76, 0.124, and 0.882, respectively, and it also performed well in the test cohort. Kaplan-Meier survival analysis revealed that the model could effectively determine patient prognosis.
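As background for the Kaplan-Meier analysis mentioned above, here is a minimal sketch of the product-limit estimator. It is not the authors' code; the function name `kaplan_meier` is hypothetical, and how patients were assigned to risk groups before the curves were compared is not stated in this abstract. In practice the estimator would be applied separately to each model-defined risk group.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    curve = []
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censorings at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

If the curves for the high-risk and low-risk groups separate clearly, this suggests that the model's risk score carries prognostic information.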
CONCLUSIONS: To our knowledge, this is the first study to develop ML prognostic models for ICC in the high-incidence age group. Of the ML models, the random survival forest model was best at prognosis prediction.