%0 Journal Article %T Transformers for colorectal cancer segmentation in CT imaging. %A Hille G %A Tummala P %A Spitz L %A Saalfeld S %J Int J Comput Assist Radiol Surg %V 0 %N 0 %D 2024 Jul 4 %M 38965166 %F 3.421 %R 10.1007/s11548-024-03217-9 %X OBJECTIVE: Transformer models have recently become the state of the art in various medical image segmentation tasks and challenges, outperforming most conventional deep learning approaches. Following that trend, this study applies various transformer models to the highly challenging task of colorectal cancer (CRC) segmentation in CT imaging and assesses how they compare with the current state-of-the-art convolutional neural network (CNN), the nnUnet. Furthermore, we investigate the impact of network size on the resulting accuracy, since transformer models tend to be significantly larger than conventional network architectures.
METHODS: For this purpose, six different transformer models with specific architectural advancements and network sizes were implemented alongside the aforementioned nnUnet and applied to the CRC segmentation task of the Medical Segmentation Decathlon.
RESULTS: The best results were achieved with the Swin-UNETR, D-Former, and VT-Unet, all transformer models, with Dice similarity coefficients (DSC) of 0.60, 0.59, and 0.59, respectively. Thus, transformer architectures outperformed the current state-of-the-art CNN, the nnUnet, on this task. Furthermore, a comparison with the inter-observer variability (IOV) of approx. 0.64 DSC indicates almost expert-level accuracy. The comparatively low IOV emphasizes the complexity and challenge of CRC segmentation and indicates limits on the achievable segmentation accuracy.
CONCLUSIONS: In this study, transformer models continue their upward trend in producing state-of-the-art results, including for the challenging task of CRC segmentation. However, with ever smaller advances in overall accuracy, as demonstrated in this study by the on-par performance of multiple network variants, other advantages such as efficiency, low computational demands, or ease of adaptation to new tasks become increasingly relevant.
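
The Dice similarity coefficient (DSC) quoted in the RESULTS measures the overlap between a predicted and a reference segmentation: DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition and not the evaluation code used in the study. The function name `dice_similarity`, the representation of masks as sets of voxel coordinates, and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_similarity(pred_voxels, ref_voxels):
    """Dice similarity coefficient between two binary segmentations.

    Each mask is given as an iterable of foreground voxel coordinates,
    e.g. (z, y, x) tuples. This representation is an assumption for
    illustration; real pipelines usually work on dense arrays.
    """
    pred = set(pred_voxels)
    ref = set(ref_voxels)
    total = len(pred) + len(ref)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    overlap = len(pred & ref)
    return 2.0 * overlap / total


# Hypothetical toy masks: 3 voxels each, 2 of them shared.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
reference = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
print(round(dice_similarity(predicted, reference), 3))  # 2*2 / (3+3) -> 0.667
```

Because the same metric can be applied to two human annotations of the same scan, the reported IOV of approx. 0.64 DSC and the model scores of 0.59 to 0.60 are directly comparable.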