Keywords: Channel attention; Convolutional neural network; Medical image segmentation; Self-attention; Spatial attention

MeSH: Neural Networks, Computer; Image Processing, Computer-Assisted

Source: DOI:10.1016/j.compbiomed.2024.108265

Abstract:
Convolution operates within a local window of the input image, so a convolutional neural network (CNN) is adept at capturing local information. The self-attention (SA) mechanism, by contrast, extracts features by computing the correlation between tokens at all positions in the image, which gives it an advantage in capturing global information. The two modules can therefore complement each other to improve feature extraction, but how to fuse them effectively remains a problem worthy of further study. In this paper, we propose CSAP-UNet, a network that runs CNN and SA in parallel with U-Net as the backbone. The encoder consists of two parallel branches, a CNN and a Transformer, that extract features from the input image, accounting for both global dependencies and local information. Because medical images come from certain frequency bands within the spectrum, their color channels are not as uniform as those of natural images. Meanwhile, medical segmentation pays more attention to lesion regions in the image. The attention fusion module (AFM) therefore integrates channel attention and spatial attention in series to fuse the output features of the two branches. Medical image segmentation is essentially the task of locating object boundaries in the image, so the boundary enhancement module (BEM) is placed in the shallow layers of the proposed network to focus more specifically on pixel-level edge details. Experimental results on three public datasets show that CSAP-UNet outperforms state-of-the-art networks, particularly on the ISIC 2017 dataset. Cross-dataset evaluation on Kvasir and CVC-ClinicDB shows that CSAP-UNet has strong generalization ability. Ablation experiments also indicate the effectiveness of the designed modules. The code for training and testing is available at https://github.com/zhouzhou1201/CSAP-UNet.git.
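The abstract states only that the AFM applies channel attention and then spatial attention, in series, to fuse the outputs of the two encoder branches; it does not give the internals. The sketch below illustrates that ordering under stated assumptions: the two branch outputs are concatenated along the channel axis, the channel gate is a squeeze-and-excitation style bottleneck, and the spatial gate mixes the channel-wise average and maximum maps with two scalars instead of a learned convolution. All function names, parameter names, and the random weights are hypothetical stand-ins, not taken from the authors' repository.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Rescale each channel of x (shape C x H x W) by a gate in (0, 1).

    Assumption: a squeeze-and-excitation style gate with a two-layer bottleneck.
    """
    pooled = x.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)   # bottleneck + ReLU  -> (C_reduced,)
    gate = sigmoid(w2 @ hidden)             # one weight per channel -> (C,)
    return x * gate[:, None, None]


def spatial_attention(x, w_avg, w_max):
    """Rescale each pixel of x by a gate in (0, 1).

    Assumption: the gate is a scalar mix of the channel-wise average and
    maximum maps; a learned convolution would normally take this place.
    """
    avg_map = x.mean(axis=0)                # (H, W)
    max_map = x.max(axis=0)                 # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return x * gate[None, :, :]


def attention_fusion(cnn_feat, trans_feat, params):
    """Fuse two branch outputs: concatenate, then channel and spatial gates in series."""
    x = np.concatenate([cnn_feat, trans_feat], axis=0)   # (2C, H, W)
    x = channel_attention(x, params["w1"], params["w2"])
    return spatial_attention(x, params["w_avg"], params["w_max"])


rng = np.random.default_rng(0)
C, H, W = 4, 16, 16
cnn_feat = rng.standard_normal((C, H, W))     # stand-in for the CNN branch output
trans_feat = rng.standard_normal((C, H, W))   # stand-in for the Transformer branch output
params = {
    "w1": rng.standard_normal((2, 2 * C)),    # random weights, untrained
    "w2": rng.standard_normal((2 * C, 2)),
    "w_avg": 1.0,
    "w_max": 1.0,
}
fused = attention_fusion(cnn_feat, trans_feat, params)
print(fused.shape)  # (8, 16, 16)
```

Because both gates lie in (0, 1), the fused tensor is an element-wise down-weighted copy of the concatenated input: the channel gate decides which feature maps matter, and the spatial gate then decides which locations matter within them. In a trained network the weights would be learned rather than sampled at random.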