Keywords: Computed tomography; Magnetic resonance imaging; Radiomics; Reproducibility; Segmentation

MeSH: Humans; Reproducibility of Results; Retrospective Studies; Consensus; Carcinoma, Renal Cell / pathology; Kidney Neoplasms / pathology; Image Processing, Computer-Assisted / methods

Source: DOI:10.1016/j.ejrad.2023.110893

Abstract:
OBJECTIVE: To evaluate the reliability of consensus-based segmentation in terms of reproducibility of radiomic features.
METHODS: In this retrospective study, three tumor data sets were investigated: breast cancer (n = 30), renal cell carcinoma (n = 30), and pituitary macroadenoma (n = 30). MRI was used for the breast and pituitary data sets, and CT for the renal data set. Twelve readers participated in the segmentation process. A consensus segmentation was created by having a reader correct a previously drawn region or volume of interest. Four experiments were designed to evaluate the reproducibility of radiomic features. Reliability was assessed with the intraclass correlation coefficient (ICC) at two cut-off values: 0.75 and 0.90.
RESULTS: Considering the lower bound of the 95% confidence interval and the ICC threshold of 0.90, at least 61% of the radiomic features were not reproducible in the inter-consensus analysis. In the susceptibility experiment, at least half (54%) became non-reproducible when the first reader was replaced with a different reader. In the intra-consensus analysis, at least about one-third (32%) were non-reproducible when the same second reader re-segmented the image over the same first reader's segmentation two weeks later. Compared with the inter-reader analysis based on independent single readers, the inter-consensus analysis did not yield a statistically significant improvement in the rate of reproducible features in any data set or analysis.
CONCLUSIONS: Despite the positive connotation of the word "consensus", it is essential to remember that consensus-based segmentation has significant reproducibility issues. Therefore, the use of consensus-based segmentation alone should be avoided unless a reliability analysis is performed, even if such an analysis is not practical in clinical settings.
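The reliability analysis described in the methods can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the code below assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). It computes only the point estimate, whereas the study judged reproducibility by the lower bound of the 95% confidence interval, which is stricter. The function names `icc2_1` and `fraction_below` are hypothetical, and the ratings are the standard worked example of Shrout and Fleiss (1979), not data from this study.

```python
def icc2_1(ratings):
    """Point estimate of ICC(2,1) for a subjects-by-readers table.

    `ratings` is a list of rows, one per subject (e.g. one tumor),
    each row holding one feature value per reader or consensus.
    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    return (msr - mse) / denom if denom != 0 else float("nan")


def fraction_below(icc_values, threshold):
    """Share of features whose ICC falls below a cut-off (0.75 or 0.90)."""
    below = sum(1 for v in icc_values if v < threshold)
    return below / len(icc_values)


# Illustrative ratings: 6 subjects rated by 4 readers (Shrout & Fleiss, 1979).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    print(round(icc2_1(example), 3))
```

In a radiomics setting, `icc2_1` would be applied once per feature across all tumors in a data set, and `fraction_below` would then report the share of features that miss each cut-off.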