Keywords: Feature fusion; Medical image segmentation; Self-attention; UNet

MeSH: Humans; Image Processing, Computer-Assisted / methods; Neural Networks, Computer; Algorithms

Source: DOI:10.1016/j.compbiomed.2024.108947

Abstract:
Recently, ViTs and CNNs built on encoder-decoder architectures have become the dominant models in medical image segmentation. However, each has its own deficiencies: (1) CNNs have difficulty capturing interactions between two distant locations. (2) ViTs struggle to model interactions among local context information and carry high computational complexity. To address these deficiencies, we propose a new network for medical image segmentation, called FCSU-Net. FCSU-Net uses the proposed collaborative fusion of multi-scale feature block, which enables the network to obtain richer and more accurate features. In addition, FCSU-Net fuses full-scale feature information through the FFF (Full-scale Feature Fusion) structure instead of simple skip connections, and establishes long-range dependencies along multiple dimensions through the CS (Cross-dimension Self-attention) mechanism, in which the dimensions complement one another. The CS mechanism also retains the advantage of convolutions in capturing local contextual weights. Finally, FCSU-Net is validated on several datasets, and the results show that it not only has a relatively small number of parameters but also achieves leading segmentation performance.
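The abstract does not describe the internals of the FFF structure, so the following is only a generic sketch of what fusing full-scale features can look like, not the authors' implementation. It assumes a hypothetical four-level encoder pyramid, nearest-neighbour resizing, and plain channel concatenation; the names `resize_nearest` and `fuse_full_scale` are illustrative and do not come from the paper.

```python
import numpy as np

def resize_nearest(feat, out_h, out_w):
    """Resize a (C, H, W) feature map to (C, out_h, out_w) by nearest-neighbour sampling."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]

def fuse_full_scale(features, target_index):
    """Bring every scale to the target scale's resolution, then concatenate on channels.

    Assumption: a real network would follow this with learned convolutions;
    this sketch only shows the resampling and concatenation step.
    """
    _, out_h, out_w = features[target_index].shape
    resized = [resize_nearest(f, out_h, out_w) for f in features]
    return np.concatenate(resized, axis=0)

# Hypothetical encoder pyramid: channels double as the resolution halves.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((8 * 2**i, 64 // 2**i, 64 // 2**i)) for i in range(4)]

fused = fuse_full_scale(pyramid, target_index=1)
print(fused.shape)  # (120, 32, 32)
```

A simple skip connection would pass only `pyramid[1]` to the matching decoder stage, whereas here the decoder stage receives information from all four scales at once.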