BACKGROUND: Radiomics analysis using on-board volumetric images has attracted research attention as a method for predicting prognosis during treatment; however, the lack of standardization is still one of the main concerns.
OBJECTIVE: This study investigated the factors that influence the reproducibility of radiomic features extracted from on-board volumetric images using an anthropomorphic radiomics phantom. Furthermore, a phantom experiment was conducted with different treatment machines from multiple institutions as external validation to identify reproducible radiomic features.
METHODS: The phantom was designed to be 35 × 20 × 20 cm with eight types of heterogeneous spheres (⌀ = 1, 2, and 3 cm). On-board volumetric images were acquired using 15 treatment machines from eight institutions. Of these, kilovoltage cone-beam computed tomography (kV-CBCT) image data acquired from four treatment machines at one institution were used as an internal evaluation dataset to explore the reproducibility of radiomic features. The remaining image data, including kV-CBCT, megavoltage CBCT (MV-CBCT), and megavoltage computed tomography (MV-CT) images provided by seven different institutions (11 treatment machines), were used as an external validation dataset. A total of 1,302 radiomic features, comprising 18 first-order, 75 texture, 465 (i.e., 93 × 5) Laplacian of Gaussian (LoG) filter-based, and 744 (i.e., 93 × 8) wavelet filter-based features, were extracted within the spheres. The intraclass correlation coefficient (ICC) was calculated on the internal evaluation dataset to explore feature repeatability and reproducibility. Subsequently, the coefficient of variation (COV) was calculated on the external validation dataset to assess feature variability across institutions. An absolute ICC exceeding 0.85 or a COV under 5% was considered indicative of a highly reproducible feature.
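The two reproducibility criteria described above can be illustrated with a minimal Python sketch. The abstract does not state which ICC form was used or how the COV was computed, so the two-way random-effects, absolute-agreement, single-measurement form ICC(2,1), the sample standard deviation in the COV, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
from statistics import mean, stdev

# Thresholds taken from the abstract.
ICC_THRESHOLD = 0.85
COV_THRESHOLD_PERCENT = 5.0

# Feature count stated in the abstract: 93 base features (18 first-order + 75 texture),
# plus 93 x 5 LoG-filtered and 93 x 8 wavelet-filtered features.
BASE_FEATURES = 18 + 75
TOTAL_FEATURES = BASE_FEATURES + BASE_FEATURES * 5 + BASE_FEATURES * 8


def coefficient_of_variation(values):
    """COV in percent: sample standard deviation divided by |mean| (assumed definition)."""
    m = mean(values)
    if m == 0:
        return float("inf")
    return stdev(values) / abs(m) * 100.0


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per target (e.g. sphere) and one
    column per condition (e.g. machine). The ICC form is an assumption."""
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        return float("nan")  # no variance at all, ICC is undefined
    return (ms_rows - ms_error) / denominator


def passes_icc(icc):
    """Internal evaluation criterion: absolute ICC above 0.85."""
    return abs(icc) > ICC_THRESHOLD


def passes_cov(cov_percent):
    """External validation criterion: COV below 5%."""
    return cov_percent < COV_THRESHOLD_PERCENT
```

In this sketch, each feature would be scored once per comparison (for example, one ICC per feature across the four internal machines, and one COV per feature across the external machines), and the percentage of features passing each threshold would then be summarized.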
RESULTS: For internal evaluation, ICC analysis showed that the median percentage of radiomic features with high repeatability was 95.2%. The ICC analysis indicated that the median percentages of highly reproducible features across tube currents, reconstruction algorithms, and treatment machines decreased by 20.8%, 29.2%, and 33.3%, respectively. For external validation, the COV analysis showed that the median percentage of reproducible features was 31.5%. A total of 16 features, comprising nine LoG filter-based and seven wavelet filter-based features, were identified as highly reproducible. Gray-level run-length matrix (GLRLM) features were the most frequent among them (N = 8), followed by gray-level dependence matrix (N = 7) and gray-level co-occurrence matrix (N = 1) features.
CONCLUSIONS: We developed a standard phantom for radiomics analysis of kV-CBCT, MV-CBCT, and MV-CT images. Using this phantom, we found that differences in the treatment machine and image reconstruction algorithm reduce the reproducibility of radiomic features from on-board volumetric images. Specifically, the most reproducible features in external validation were LoG or wavelet filter-based GLRLM features. However, the acceptability of the identified features should be examined at each institution before the findings are applied to prognosis prediction.