Keywords: Transformer; dual-branch structure; medical image segmentation; semi-supervised learning

Source: DOI:10.1002/acm2.14483

Abstract:
OBJECTIVE: Deep learning has become a popular approach to medical image segmentation in recent years, but it still faces two challenges. First, because medical data are specialized, precise annotation is time-consuming and labor-intensive, so training neural networks effectively with limited labeled data remains a significant challenge in medical image analysis. Second, the convolutional neural networks commonly used for medical image segmentation tend to focus on local image features, whereas recognizing complex anatomical structures or irregular lesions often requires both local and global information; this limitation has become a bottleneck for convolution-based segmentation. To address these two issues, we propose a novel network architecture.
METHODS: We integrate a shifted-window mechanism to learn more comprehensive semantic information and adopt a semi-supervised learning strategy that can incorporate a flexible amount of unlabeled data. Specifically, a typical U-shaped encoder-decoder structure is used to obtain rich feature maps. Each encoder is designed as a dual-branch structure containing Swin modules with windows of different sizes to capture multi-scale features. To make effective use of unlabeled data, a level set function is introduced to establish consistency between level set function regression and pixel classification.
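The two ideas in the methods can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, whose code has not yet been released. The function names, the sign convention (level set values negative inside the object), the steepness parameter `k`, and the mean-squared-error form of the consistency term are all assumptions made for illustration. The first function shows how one feature map can be split into non-overlapping windows of two different sizes, as a dual-branch encoder might do. The second pair shows one common way to compare a regressed level set map with pixel-wise class probabilities.

```python
import numpy as np


def window_partition(feat, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    H and W must be divisible by window_size.
    """
    h, w, c = feat.shape
    assert h % window_size == 0 and w % window_size == 0
    x = feat.reshape(h // window_size, window_size,
                     w // window_size, window_size, c)
    # Bring the two window-grid axes together, then flatten each window.
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, window_size * window_size, c)


def level_set_to_prob(phi, k=10.0):
    """Smooth Heaviside step mapping level set values to soft foreground
    probabilities. Assumed convention: phi < 0 inside the object."""
    return 1.0 / (1.0 + np.exp(k * phi))


def consistency_loss(prob, phi, k=10.0):
    """Mean squared difference between the pixel-classification output
    and the mask implied by the regressed level set (hypothetical form)."""
    return float(np.mean((prob - level_set_to_prob(phi, k)) ** 2))


# Hypothetical 8x8 feature map with 2 channels, viewed at two window sizes.
feat = np.arange(8 * 8 * 2, dtype=np.float64).reshape(8, 8, 2)
coarse = window_partition(feat, 4)   # shape (4, 16, 2)
fine = window_partition(feat, 2)     # shape (16, 4, 2)

# Level set values for a 1x2 "image": first pixel inside, second outside.
phi = np.array([[-1.0, 1.0]])
agree = consistency_loss(level_set_to_prob(phi), phi)
disagree = consistency_loss(1.0 - level_set_to_prob(phi), phi)
```

Because the consistency term needs no ground-truth labels, a loss of this kind can in principle be evaluated on unlabeled images, which is how such a term would let unlabeled data contribute to training.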
RESULTS: We conducted experiments on the COVID-19 CT and DRIVE datasets and compared our approach with various semi-supervised and fully supervised learning models. On the COVID-19 CT dataset, we achieved a segmentation accuracy of up to 74.56%. On the DRIVE dataset, the segmentation accuracy was 79.79%.
CONCLUSIONS: The results show that our method performs strongly on several commonly used evaluation metrics. The high segmentation accuracy indicates that Swin modules with different window sizes can enhance the model's feature extraction capability and that the level set function enables semi-supervised models to use unlabeled data more effectively. This provides meaningful insights for the application of deep learning in medical image segmentation. Our code will be released once the manuscript is accepted for publication.