OBJECTIVE: This study used computed tomography (CT) radiomics features and mainstream machine learning methods to classify gross tumor volume (GTV) and normal liver tissue in hepatocellular carcinoma, with the aim of establishing an automatic classification model.
METHODS: We recruited 104 patients with pathologically confirmed hepatocellular carcinoma. GTV and normal liver tissue samples were manually segmented as regions of interest and randomly partitioned into five folds for cross-validation. Dimensionality reduction was performed using LASSO regression. Radiomics models were constructed using logistic regression, support vector machine (SVM), random forest, XGBoost, and AdaBoost algorithms. The diagnostic efficacy, discrimination, and calibration of the algorithms were evaluated using area under the receiver operating characteristic curve (AUC) analyses and calibration plot comparison.
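The fold partitioning and AUC evaluation described above can be illustrated with a minimal, standard-library-only sketch. It is not the study's pipeline: the helper names `k_fold_indices` and `roc_auc`, the random seed, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = GTV region, 0 = normal liver region.
labels = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.7, 0.1, 0.6, 0.35]

folds = k_fold_indices(len(labels), k=5, seed=42)
for fold_id, test_idx in enumerate(folds):
    # Every sample outside the held-out fold is used for training.
    train_idx = [i for f in folds if f is not test_idx for i in f]
    print(f"fold {fold_id}: test={sorted(test_idx)} train_size={len(train_idx)}")

print("AUC on toy scores:", roc_auc(labels, scores))
```

In a real analysis, the feature selection and classifier would be refit on the training portion of each fold, and the AUC would be computed on that fold's held-out samples before averaging.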
RESULTS: Seven selected radiomics features excelled at distinguishing GTV from normal liver tissue. The XGBoost algorithm had the best discrimination and overall diagnostic performance, with an AUC of 0.9975 [95% confidence interval (CI): 0.9973-0.9978] and a mean Matthews correlation coefficient (MCC) of 0.9369. SVM had the second-best discrimination and diagnostic performance, with an AUC of 0.9846 (95% CI: 0.9835-0.9857), a mean MCC of 0.9105, and better calibration. All other algorithms also showed an excellent ability to distinguish GTV from normal liver tissue (mean AUC of 0.9825, 0.9861, 0.9727, and 0.9644 for AdaBoost, random forest, logistic regression, and naive Bayes, respectively).
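For readers unfamiliar with the MCC reported above, the following sketch shows how it is computed from a binary confusion matrix and averaged across folds. The function names and the per-fold confusion counts are hypothetical and are not taken from the study's data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def mean_mcc(fold_counts):
    """Average the MCC over a list of per-fold (tp, tn, fp, fn) tuples."""
    values = [matthews_corrcoef(*counts) for counts in fold_counts]
    return sum(values) / len(values)


# Hypothetical per-fold counts (tp, tn, fp, fn), for illustration only.
toy_folds = [(20, 20, 1, 0), (19, 21, 0, 1), (20, 19, 2, 0), (18, 21, 0, 2), (20, 20, 0, 1)]
print("mean MCC over toy folds:", round(mean_mcc(toy_folds), 4))
```

MCC ranges from -1 (total disagreement) through 0 (chance-level prediction) to 1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it informative alongside AUC.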
CONCLUSIONS: CT radiomics based on machine learning algorithms can accurately classify GTV and normal liver tissue, with XGBoost and SVM serving as the best complementary algorithms.