Development of machine learning models for patients in the high intrahepatic cholangiocarcinoma incidence age group.

Abstract:

BACKGROUND: Intrahepatic cholangiocarcinoma (ICC) has a poor prognosis and is understudied. Using the clinical features of patients with ICC, we constructed machine learning models to assess the importance of these features for survival and to predict patient prognosis accurately, with the aim of providing reference values that help physicians develop more effective treatment plans.
METHODS: This study used machine learning (ML) algorithms to build prediction models from ICC data on 1,751 patients in the SEER (Surveillance, Epidemiology, and End Results) database and 58 hospital cases. Model performance was compared using receiver operating characteristic (ROC) curve analysis, the C-index, and Brier scores.
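As an illustration of two of the metrics named above, the following is a minimal sketch of Harrell's C-index and a fixed-horizon Brier score in plain Python. It is not the authors' code: the function names, the toy data, and the choice to drop subjects censored before the horizon (instead of applying inverse-probability-of-censoring weights) are assumptions made for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that are ordered correctly.

    A pair is comparable when the subject with the shorter follow-up time
    had an observed event. A higher risk score should mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score(times, events, surv_probs, horizon):
    """Unweighted Brier score at a fixed time horizon.

    surv_probs[i] is the predicted probability that subject i survives past
    the horizon. Subjects censored before the horizon are dropped here
    (an assumption of this sketch, not a recommended practice).
    """
    squared_errors = []
    for t, e, p in zip(times, events, surv_probs):
        if t > horizon:
            alive = 1.0
        elif e:
            alive = 0.0
        else:
            continue  # censored before the horizon: true status unknown
        squared_errors.append((p - alive) ** 2)
    if not squared_errors:
        raise ValueError("no subjects with known status at the horizon")
    return sum(squared_errors) / len(squared_errors)


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.2]
toy_surv_at_12 = [0.1, 0.3, 0.8, 0.9]

c_index = concordance_index(toy_times, toy_events, toy_risk)
brier = brier_score(toy_times, toy_events, toy_surv_at_12, horizon=12)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, while a lower Brier score indicates better calibrated survival probabilities.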
RESULTS: Eight variables were used to construct the ML models. Our analysis identified the random survival forest model as the best for prognostic prediction. In the training cohort, its C-index, Brier score, and area under the curve (AUC) values were 0.76, 0.124, and 0.882, respectively, and it also performed well in the test cohort. Kaplan-Meier survival analysis showed that the model could effectively determine patient prognosis.
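For readers unfamiliar with the Kaplan-Meier analysis mentioned above, here is a minimal sketch of the product-limit estimator in plain Python. The function name and toy data are hypothetical, and the abstract does not describe how patients were grouped before the curves were compared, so no grouping step is shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if the event (death) was observed, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    curve = []
    survival = 1.0
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Observed events exactly at time t.
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: follow-up in months, 1 = death, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

In a risk-stratified analysis, this estimator would be computed separately for each predicted risk group and the resulting curves compared, for example with a log-rank test.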
CONCLUSIONS: To our knowledge, this is the first study to develop ML prognostic models for ICC in the high-incidence age group. Of the ML models, the random survival forest model was best at prognosis prediction.
