squeeze-and-excitation

  • Article type: Journal Article
    Gearbox fault diagnosis is crucial for ensuring the safety of mechanical equipment and the stable operation of the whole system. However, existing diagnostic methods still have limitations, such as analysis limited to single-scale features and insufficient recognition of global temporal dependencies. To address these issues, this article proposes a new method for gearbox fault diagnosis based on MSCNN-LSTM-CBAM-SE. The output of the CBAM-SE module is deeply integrated with the multi-scale features from MSCNN and the temporal features from LSTM, constructing a comprehensive feature representation that provides richer and more precise information for fault diagnosis. The effectiveness of this method has been validated on two gearbox datasets and through ablation studies on the model. Experimental results show that the proposed model achieves excellent performance in terms of accuracy and F1 score, among other metrics. Finally, a comparison with other relevant fault diagnosis methods further verifies the advantages of the proposed model. This research offers a new solution for accurate fault diagnosis of gearboxes.
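The abstract does not describe the internals of its SE branch, so the following is only a minimal sketch of a generic squeeze-and-excitation step on a multi-channel 1D feature map, written in plain Python. The function name `se_block`, the weight matrices `w1` and `w2`, and the toy input are illustrative assumptions, not the authors' implementation.

```python
import math

def se_block(features, w1, w2):
    """Reweight the channels of `features` (a list of channels, each a list of floats).

    w1: reduction weights, shape (hidden, channels)   -- hypothetical values
    w2: expansion weights, shape (channels, hidden)   -- hypothetical values
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand to one gate per channel; sigmoid maps to (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every element of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates

features = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]  # 2 channels, 3 time steps
w1 = [[1.0, 1.0]]                              # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]                           # 1 hidden unit -> 2 gates
scaled, gates = se_block(features, w1, w2)
```

In a trained network `w1` and `w2` would be learned; here they are fixed by hand so that the first channel is amplified relative to the second.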

  • Article type: Journal Article
    Laryngeal cancer (LC) is a substantial global health problem, with reduced survival rates attributed to late-stage diagnosis. Correct treatment of LC is complex, particularly in the final stages. This cancer is a complex malignancy of the head and neck region. Recently, researchers have developed various analysis methods and tools to help medical consultants recognize LC efficiently. However, these existing tools and techniques suffer from performance constraints, such as lower accuracy in detecting LC at the early stages, additional computational complexity, and excessive time consumption in patient screening. Deep learning (DL) approaches have proven effective in recognizing LC. Therefore, this study develops an efficient LC Detection using Chaotic Metaheuristics Integration with DL (LCD-CMDL) technique. The LCD-CMDL technique mainly focuses on detecting and classifying LC using throat region images. In the LCD-CMDL technique, contrast enhancement uses the CLAHE approach. For feature extraction, the LCD-CMDL technique applies the Squeeze-and-Excitation ResNet (SE-ResNet) model to learn complex and intrinsic features from the preprocessed images. Moreover, the hyperparameters of the SE-ResNet model are tuned using a chaotic adaptive sparrow search algorithm (CSSA). Finally, the extreme learning machine (ELM) model is applied to detect and classify LC. The performance of the LCD-CMDL approach is evaluated on a benchmark throat region image database. The experimental values indicate the superior performance of the LCD-CMDL approach over recent state-of-the-art approaches.

  • Article type: Journal Article
    Accurate penetration rate prediction enhances rock-breaking efficiency and reduces disc cutter damage in tunnel boring machine (TBM) construction. However, this process faces significant challenges such as the high uncertainty of ground conditions and the complexity of maintaining optimal TBM operation in long and large tunnels. To address these challenges, we propose TCN-SENet++, a novel hybrid multistep real-time penetration rate prediction model that combines a temporal convolutional network (TCN) and a squeeze-and-excitation (SENet) block for aided tunneling. This study aims to demonstrate the application of TCN-SENet++, as well as other models such as RNN, LSTM, GRU, and TCN, for TBM penetration rate prediction. The model was developed using actual datasets collected from the Yin-Song diversion project. We employ a 30-s time step to predict the future time steps of the penetration rate (1st, 3rd, 5th, 7th, and 9th). The features that influence the penetration rate, such as the cutterhead torque, thrust, and cutterhead power, were considered. A comparative analysis using the mean absolute error and mean squared error revealed that the TCN-SENet++ model outperformed the other models, including RNN, LSTM, GRU, TCN, and TCN-SENet+. In comparison, TCN-SENet++ achieved average MSE reductions of 18%, 6%, 3%, 1%, and 2%, respectively. The TCN-SENet++ model demonstrated fewer errors in the new project, validating its effectiveness and suitability for real-time penetration rate prediction in TBM construction.
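Temporal convolutional networks are built from causal, dilated 1D convolutions, so the output at time t depends only on inputs at or before t. The sketch below shows that single operation in plain Python; the function name `causal_dilated_conv` and the toy kernel are assumptions for illustration and say nothing about the layer sizes used in TCN-SENet++.

```python
def causal_dilated_conv(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation], with zeros before t = 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the filter causal
                acc += w * x[idx]
        y.append(acc)
    return y

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
out = causal_dilated_conv(signal, kernel=[1.0, 1.0], dilation=2)
# out[t] adds the current sample and the sample two steps earlier.
```

Stacking such layers with growing dilation (1, 2, 4, ...) widens the receptive field quickly without increasing the kernel size, which is the usual motivation for this design.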

  • Article type: Journal Article
    The segmentation results of retinal blood vessels are crucial for automatically diagnosing ophthalmic diseases such as diabetic retinopathy, hypertension, and cardiovascular and cerebrovascular diseases. To improve the accuracy of vessel segmentation and better extract information about small vessels and edges, we introduce a U-Net algorithm with a supervised attention mechanism for retinal vessel segmentation. We achieve this by introducing a decoder fusion module (DFM) in the encoding part, effectively combining different convolutional blocks to extract features comprehensively. Additionally, in the decoding part of U-Net, we propose the context squeeze and excitation (CSE) decoding module to enhance important contextual feature information and the detection of tiny blood vessels. For the final output, we introduce the supervised fusion mechanism (SFM), which combines multiple branches from shallow to deep layers, fusing multi-scale features and integrating low-level and high-level information to improve segmentation performance. Our experimental results on the public datasets DRIVE, STARE, and CHASE_DB1 demonstrate the excellent performance of our proposed network.

  • Article type: Journal Article
    Sleep staging is crucial for assessing sleep quality and diagnosing sleep disorders. Recent advances in deep learning methods with electroencephalogram (EEG) signals have shown remarkable success in automatic sleep staging. However, the use of deeper neural networks may lead to vanishing and exploding gradients, while the non-stationary nature and low signal-to-noise ratio of EEG signals can negatively impact feature representation. To overcome these challenges, we propose a novel lightweight sequence-to-sequence deep learning model, 1D-ResNet-SE-LSTM, to classify sleep stages into five classes using single-channel raw EEG signals. Our proposed model consists of two main components: a one-dimensional residual convolutional neural network with a squeeze-and-excitation module to extract and reweight features from EEG signals, and a long short-term memory network to capture the transition rules among sleep stages. In addition, we applied a weighted cross-entropy loss function to alleviate the class imbalance problem. We evaluated the performance of our model on two publicly available datasets: Sleep-EDF Expanded, which consists of 153 overnight PSG recordings collected from 78 healthy subjects, and ISRUC-Sleep, which includes 100 PSG recordings collected from 100 subjects diagnosed with various sleep disorders. The model obtained overall accuracies of 86.39% and 81.97%, respectively, along with corresponding macro-average F1-scores of 81.95% and 79.94%. Our model outperforms existing sleep staging models in terms of overall performance metrics and per-class F1-scores for several sleep stages, particularly for the N1 stage, where it achieves F1-scores of 59.00% and 55.53%. The kappa coefficient is 0.812 and 0.766 for the Sleep-EDF Expanded and ISRUC-Sleep datasets, respectively, indicating strong agreement with certified sleep experts. We also investigated how different weight coefficient combinations and different sequence lengths of the input EEG epochs affect the model's performance. Furthermore, an ablation study was conducted to evaluate the contribution of each component to the model's performance. The results demonstrate the effectiveness and robustness of the proposed model in classifying sleep stages, and highlight its potential to reduce human clinicians' workload, making sleep assessment and diagnosis more effective. However, the proposed model is subject to several limitations. Firstly, the model is a sequence-to-sequence network, which requires input sequences of EEG epochs. Secondly, the weight coefficients in the loss function could be further optimized to balance the classification performance of each sleep stage. Finally, apart from the channel attention mechanism, incorporating more advanced attention mechanisms could enhance the model's effectiveness.
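The weighted cross-entropy mentioned above can be sketched as follows. This is a generic formulation in plain Python, assuming the loss is normalized by the sum of the sample weights (one common convention); the abstract does not give the weight values, so `class_weights` and the toy probabilities are hypothetical.

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log(p[y]); each sample is weighted by its true class."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(p[y])  # rare classes get a larger w
        weight_sum += w
    return total / weight_sum

probs = [[0.5, 0.5], [0.25, 0.75]]  # predicted class probabilities per sample
labels = [0, 1]                     # true class index per sample
uniform = weighted_cross_entropy(probs, labels, [1.0, 1.0])
upweighted = weighted_cross_entropy(probs, labels, [3.0, 1.0])
```

Raising the weight of a class makes errors on that class dominate the loss, which is how a hard minority stage such as N1 can be given more influence during training.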

  • Article type: Journal Article
    Image semantic segmentation is an important part of automatic driving assistance technology. Segmentation algorithms are challenged by the complexity of road scenes and by the real-time requirements of their application scenarios. To meet these challenges, Deep Dual-resolution Road Scene Segmentation Networks based on Decoupled Dynamic Filter and Squeeze-Excitation (DDF&SE-DDRNet) are proposed in this paper. The proposed DDF&SE-DDRNet uses a decoupled dynamic filter in each module to reduce the number of network parameters and to enable the network to dynamically adjust the weight of each convolution kernel. We add the Squeeze-and-Excitation module to each module of DDF&SE-DDRNet so that the local feature maps in the network can obtain global features, reducing the impact of local image interference on the segmentation result. The experimental results on the Cityscapes dataset show that the segmentation accuracy of DDF&SE-DDRNet is at least 2% higher than that of existing algorithms. Moreover, DDF&SE-DDRNet also has a satisfactory inference speed.

  • Article type: Journal Article
    OBJECTIVE: Accurate segmentation of skin lesions is a pivotal step in dermoscopy image classification, which provides a powerful means for dermatologists to diagnose skin diseases. However, due to blurred boundaries, low contrast between the lesion and its surrounding skin, and changes in color and shape, most existing segmentation methods still face great challenges in obtaining receptive fields and extracting image feature information. To settle the above issues, we construct a new framework, named SEACU-Net, to analyze and segment skin lesion images.
    METHODS: Inspired by the U-Net, we utilize dense convolution blocks to obtain more discriminative information. Then, at each encoding and decoding stage, a channel and spatial squeeze & excitation layer is placed after each convolution to adaptively enhance useful feature information and suppress low-value information from different feature channels. In addition, the attention mechanism is integrated into the convolutional long short-term memory (ConvLSTM) structure, which improves sensitivity and prediction accuracy. Furthermore, this network introduces a novel loss based on binary cross-entropy and Jaccard losses, which can ensure more balanced segmentation.
    RESULTS: The proposed method is applied to the ISIC 2017 and 2018 public image databases and obtains better performance in Dice, Jaccard, and Accuracy, with Dice values of 89.11% and 87.58%, Jaccard values of 80.50% and 78.12%, and Accuracy values of 95.01% and 93.60%, respectively.
    CONCLUSIONS: The results of quantitative and qualitative experiments show that our method reaches high-performance skin lesion segmentation, and can help radiologists make radiotherapy treatment plans in clinical practice.
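The abstract states only that the loss is based on binary cross-entropy and Jaccard losses, not how the two are combined. As a minimal sketch under that caveat, the following plain-Python function mixes the two terms with a hypothetical weight `alpha`; the name `bce_jaccard_loss`, the value of `alpha`, and the smoothing constant `eps` are all assumptions.

```python
import math

def bce_jaccard_loss(pred, target, alpha=0.5, eps=1e-7):
    """alpha * BCE + (1 - alpha) * soft Jaccard loss over flattened pixels."""
    # Clamp probabilities so that log() stays finite.
    p = [min(max(q, eps), 1.0 - eps) for q in pred]
    # Pixel-wise binary cross-entropy, averaged over all pixels.
    bce = -sum(t * math.log(q) + (1 - t) * math.log(1 - q)
               for q, t in zip(p, target)) / len(p)
    # Soft Jaccard: intersection and union computed on probabilities.
    inter = sum(q * t for q, t in zip(p, target))
    union = sum(p) + sum(target) - inter
    jaccard = 1.0 - (inter + eps) / (union + eps)
    return alpha * bce + (1.0 - alpha) * jaccard

target = [1, 0, 1]
good = bce_jaccard_loss([1.0, 0.0, 1.0], target)  # prediction matches the mask
bad = bce_jaccard_loss([0.0, 1.0, 0.0], target)   # prediction is inverted
```

The cross-entropy term scores each pixel independently, while the Jaccard term scores region overlap, so the mix is less dominated by the large background area than cross-entropy alone.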

  • Article type: Journal Article
    OBJECTIVE: Lung cancer is among the diseases with the highest global morbidity and mortality rates. The automatic segmentation of lung tumors from CT images is of great significance. However, the segmentation faces several challenges, including variable shapes and different sizes, as well as complicated surrounding tissues.
    METHODS: We propose a multi-scale segmentation squeeze-and-excitation UNet with a conditional random field (M-SegSEUNet-CRF) to automatically segment lung tumors from CT images. M-SegSEUNet-CRF employs a multi-scale strategy to solve the problem of variable tumor size. Through the spatially adaptive attention mechanism, the segmentation SE blocks embedded in 3D UNet are utilized to highlight tumor regions. A densely connected CRF framework is further added to delineate tumor boundaries at a detailed level. In total, 759 CT scans of patients with lung cancer were used to train and evaluate the M-SegSEUNet-CRF model (456 for training, 152 for validation, and 151 for testing). In addition, the public NSCLC-Radiomics and LIDC datasets were used to validate the generalization of the proposed method. The role of different modules in the M-SegSEUNet-CRF model is analyzed through ablation experiments, and the performance is compared with that of UNet, its variants, and other state-of-the-art models.
    RESULTS: M-SegSEUNet-CRF can achieve a Dice coefficient of 0.851 ± 0.071, intersection over union (IoU) of 0.747 ± 0.102, sensitivity of 0.827 ± 0.108, and positive predictive value (PPV) of 0.900 ± 0.107. Without a multi-scale strategy, the Dice coefficient drops to 0.820 ± 0.115; without CRF, it drops to 0.842 ± 0.082, and without both, it drops to 0.806 ± 0.120. M-SegSEUNet-CRF presented a higher Dice coefficient than 3D UNet (0.782 ± 0.115) and its variants (ResUNet, 0.797 ± 0.132; DenseUNet, 0.792 ± 0.111, and UNETR, 0.794 ± 0.130). Although the performance slightly declines with the decrease in tumor volume, M-SegSEUNet-CRF exhibits more obvious advantages than the other comparative models.
    CONCLUSIONS: Our M-SegSEUNet-CRF model improves the segmentation ability of UNet through the multi-scale strategy and spatially adaptive attention mechanism. The CRF enables a more precise delineation of tumor boundaries. The M-SegSEUNet-CRF model integrates these characteristics and demonstrates outstanding performance in the task of lung tumor segmentation. It can furthermore be extended to deal with other segmentation problems in the medical imaging field.
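The four metrics reported in the results can all be derived from voxel-wise true positives, false positives, and false negatives. The following is a minimal plain-Python sketch on flattened binary masks; the function name `segmentation_metrics` and the toy masks are illustrative, and it assumes both masks contain at least one foreground voxel so that no denominator is zero.

```python
def segmentation_metrics(pred, truth):
    """Dice, IoU, sensitivity, and PPV for two flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # predicted and real tumor
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # predicted, not tumor
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # missed tumor
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
    }

pred = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
m = segmentation_metrics(pred, truth)
```

For a single case, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU); averages over many cases, as reported in the abstract, need not satisfy that identity exactly.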

  • Article type: Journal Article
    Objective. Spatial and spectral features extracted from electroencephalogram (EEG) are critical for the classification of motor imagery (MI) tasks. As prevalently used methods, the common spatial pattern (CSP) and filter bank CSP (FBCSP) can effectively extract spatial-spectral features from MI-related EEG. To further improve the separability of the CSP features, we proposed a distinguishable spatial-spectral feature learning neural network (DSSFLNN) framework for MI-based brain-computer interfaces (BCIs) in this study. Approach. The first step of the DSSFLNN framework was to extract FBCSP features from raw EEG signals. Then two squeeze-and-excitation modules were used to re-calibrate CSP features along the band-wise axis and the class-wise axis, respectively. Next, we used a parallel convolutional neural network module to learn distinguishable spatial-spectral features. Finally, the distinguishable spatial-spectral features were fed to a fully connected layer for classification. To verify the effectiveness of the proposed framework, we compared it with the state-of-the-art methods on BCI competition IV datasets 2a and 2b. Main results. The results showed that the DSSFLNN framework can achieve a mean Cohen's kappa value of 0.7 on two datasets, which outperformed the state-of-the-art methods. Moreover, two additional experiments were conducted and they proved that the combination of band-wise feature learning and class-wise feature learning can achieve significantly better performance than only using either one of them. Significance. The proposed DSSFLNN can effectively improve the decoding performance of MI-based BCIs.
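Cohen's kappa, the metric reported above, corrects raw agreement for the agreement expected by chance. A minimal plain-Python sketch follows; the function name `cohens_kappa` and the toy labels are illustrative, and the code assumes chance agreement is below 1 so the denominator is non-zero.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """kappa = (p_o - p_e) / (1 - p_e) for two label sequences of equal length."""
    n = len(y_true)
    # Observed agreement: fraction of trials where the two labelings match.
    p_o = sum(1 for a, b in zip(y_true, y_pred) if a == b) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
kappa = cohens_kappa(y_true, y_pred)
```

In this toy case the accuracy is 0.75 but kappa is lower, because half of that agreement would be expected by chance given the class frequencies.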

  • Article type: Journal Article
    OBJECTIVE: Liver segmentation is an essential prerequisite for liver cancer diagnosis and surgical planning. Traditionally, the liver contour is delineated manually by a radiologist in a slice-by-slice fashion. However, this process is time-consuming and prone to errors that depend on the radiologist's experience. In this paper, a modified U-Net based framework is presented, which leverages the Squeeze-and-Excitation (SE) block, Atrous Spatial Pyramid Pooling (ASPP), and residual learning for accurate and robust liver Computed Tomography (CT) segmentation. The effectiveness of the proposed method was tested on two public datasets, LiTS17 and SLiver07.
    METHODS: A new network architecture called SAR-U-Net was designed, which is grounded in the classical U-Net. Firstly, the SE block is introduced to adaptively extract image features after each convolution in the U-Net encoder, suppressing irrelevant regions and highlighting features relevant to the specific segmentation task. Secondly, the ASPP is employed to replace the transition layer and the output layer, acquiring multi-scale image information via different receptive fields. Thirdly, to alleviate the vanishing gradient problem, the traditional convolution block is replaced with residual structures, which allows the network to gain accuracy from considerably increased depth.
    RESULTS: In the LiTS17 database experiment, five popular metrics were used for evaluation, including Dice coefficient, VOE, RVD, ASD and MSD. Compared with other closely related models, the proposed method achieved the highest accuracy. In addition, in the experiment of the SLiver07 dataset, compared with other closely related models, the proposed method achieved the highest segmentation accuracy except for the RVD.
    CONCLUSIONS: An improved U-Net network combining SE, ASPP, and residual structures is developed for automatic liver segmentation from CT images. This new model shows a great improvement on the accuracy compared to other closely related models, and its robustness to challenging problems, including small liver regions, discontinuous liver regions, and fuzzy liver boundaries, is also well demonstrated and validated.
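Two of the five metrics named in the results, VOE and RVD, depend only on voxel counts and can be sketched in a few lines of plain Python. The definitions used here are the common ones (VOE as one minus the Jaccard index, RVD as the signed volume difference relative to the ground truth); the function name `voe_rvd` and the toy masks are assumptions, and the code assumes a non-empty ground-truth mask.

```python
def voe_rvd(pred, truth):
    """Volumetric overlap error and relative volume difference for binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    vol_pred = sum(1 for p in pred if p)
    vol_truth = sum(1 for t in truth if t)
    union = vol_pred + vol_truth - inter
    voe = 1.0 - inter / union                 # 0 means perfect overlap
    rvd = (vol_pred - vol_truth) / vol_truth  # > 0 means over-segmentation
    return voe, rvd

pred = [1, 1, 1, 1, 0]
truth = [1, 1, 1, 0, 0]
voe, rvd = voe_rvd(pred, truth)
```

RVD can be zero even when the masks do not overlap well, which is one reason it is reported alongside overlap and surface-distance metrics rather than alone.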