convolutional block attention module

  • Article type: Journal Article
    Understanding a protein's function based solely on its amino acid sequence is a crucial but intricate task in bioinformatics, and one that has traditionally proven difficult. In recent years, however, deep learning has emerged as a powerful tool and has achieved significant success in protein function prediction. The strength of deep learning models lies in their ability to automatically learn informative features from protein sequences, which can then be used to predict the protein's function. This study builds upon these advances by proposing a novel model, CNN-CBAM+BiGRU, which combines a Convolutional Block Attention Module (CBAM) with bidirectional gated recurrent units (BiGRUs). CBAM acts as a spotlight, guiding the CNN to focus on the most informative parts of the protein data and leading to more accurate feature extraction. BiGRUs, a type of recurrent neural network (RNN), excel at capturing long-range dependencies within the protein sequence, which are essential for accurate function prediction. The proposed model integrates the strengths of both CNN-CBAM and BiGRU, and experiments confirm the effectiveness of this combined approach. On the human dataset, the proposed method outperforms the CNN-BIGRU+ATT model by 1.0% for cellular component, 1.1% for molecular function, and 0.5% for biological process. On the yeast dataset, it outperforms the CNN-BIGRU+ATT model by 2.4% for cellular component, 1.2% for molecular function, and 0.6% for biological process.

  • Article type: Journal Article
    Timely and accurate seizure detection is of great importance for the diagnosis and treatment of epilepsy patients. Existing seizure detection models are often complex and time-consuming, highlighting the urgent need for lightweight seizure detection. Additionally, existing methods often neglect the key feature channels and spatial regions of electroencephalography (EEG) signals. To address these issues, we propose a lightweight EEG-based seizure detection model named the lightweight inverted residual attention network (LRAN). Specifically, we employ a four-stage inverted residual mobile block (iRMB) to effectively extract hierarchical features from EEG. The convolutional block attention module (CBAM) is introduced to make the model focus on important feature channels and spatial information, thereby enhancing the discriminability of the learned features. Finally, convolution operations are used to capture local information and spatial relationships between features. We conduct intra-subject and inter-subject experiments on a publicly available dataset. Intra-subject experiments obtain 99.25% accuracy in segment-based detection and a false detection rate (FDR) of 0.36/h in event-based detection. Inter-subject experiments obtain 84.32% accuracy. Both sets of experiments maintain high classification accuracy at low cost: the model requires 25.86 M multiply-accumulate operations (MACs) and has 0.57 M parameters.
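
The channel and spatial attention that this abstract attributes to CBAM can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the LRAN authors' implementation: the function names (`shared_mlp`, `channel_attention`, `spatial_attention`), the nested-list tensor layout, and the bias-free layers are all hypothetical choices, and in a trained network the weights `w1`, `w2`, and `kernel` would be learned.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(vec, w1, w2):
    """Bias-free two-layer MLP with ReLU. w1 is hidden x C, w2 is C x hidden."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(fmap, w1, w2):
    """fmap: list of C channels, each a list of H rows of W floats.
    Returns per-channel weights in (0, 1) and the reweighted feature map."""
    # Global average and max pooling over the spatial dimensions.
    avg = [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(max(r) for r in ch) for ch in fmap]
    # Both descriptors pass through the same MLP and are summed.
    a, m = shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2)
    weights = [sigmoid(x + y) for x, y in zip(a, m)]
    scaled = [[[w * v for v in row] for row in ch]
              for w, ch in zip(weights, fmap)]
    return weights, scaled

def spatial_attention(fmap, kernel):
    """kernel: 2 x k x k weights (k odd) applied to the channel-wise
    [average, max] maps with zero padding. Returns an H x W map in (0, 1)."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    att = []
    for i in range(H):
        row = []
        for j in range(W):
            s = 0.0
            for d, desc in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[d][di][dj] * desc[y][x]
            row.append(sigmoid(s))
        att.append(row)
    return att
```

In CBAM the two steps are applied in sequence: the feature map is first scaled by the channel weights, and the result is then multiplied element-wise by the spatial map.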

  • Article type: Journal Article
    OBJECTIVE: The classification of diabetic retinopathy (DR) aims to use the implicit information in images for early diagnosis, in order to prevent and mitigate further worsening of the condition. However, existing methods typically show significant advantages only when trained on large annotated datasets. They also need the number of samples in each category to be evenly distributed, because an imbalanced sample distribution leads to excessive focus on high-frequency disease categories while neglecting less common but equally important ones. There is therefore an urgent need for a new classification method that effectively alleviates sample distribution imbalance and thereby improves the accuracy of diabetic retinopathy classification.
    METHODS: In this work, we propose MediDRNet, a dual-branch network model based on prototypical contrastive learning. The model creates prototypes for the different lesion levels, ensuring that they represent the core features of each level, and classifies a data point by comparing its similarity to the category prototypes. The dual-branch network structure addresses category imbalance and improves classification accuracy by emphasizing subtle differences in retinal lesions. In addition, the approach incorporates the convolutional block attention module for enhanced lesion feature identification.
    RESULTS: Our experiments on both the Kaggle and UWF classification datasets demonstrate that MediDRNet performs exceptionally well compared with other advanced models, especially on the UWF DR classification dataset, where it achieved state-of-the-art performance across all metrics. On the Kaggle DR classification dataset, it achieved the highest average classification accuracy (0.6327) and Macro-F1 score (0.6361). In particular, for the minority categories of diabetic retinopathy on the Kaggle dataset (Grades 1, 2, 3, and 4), the model reached classification accuracies of 58.08%, 55.32%, 69.73%, and 90.21%, respectively. In the ablation study, MediDRNet proved more effective at extracting features from diabetic retinal fundus images than other feature extraction methods.
    CONCLUSIONS: This study employed prototype contrastive learning and bidirectional branch learning strategies to construct a grading system for diabetic retinopathy lesions on imbalanced diabetic retinopathy datasets. Through the dual-branch network, the feature learning branch facilitated a smooth transition of features from the grading network to the classification learning branch, accurately identifying minority sample categories. The method not only alleviated the problem of sample imbalance but also provided strong support for the precise grading and early diagnosis of diabetic retinopathy in clinical applications, performing exceptionally well on complex diabetic retinopathy datasets. Moreover, this research significantly improved the efficiency of prevention and management of disease progression in diabetic retinopathy patients within medical practice. We encourage the use and modification of our code, which is publicly accessible on GitHub: https://github.com/ReinforceLove/MediDRNet.

  • Article type: Journal Article
    U-Net has demonstrated strong performance in medical image segmentation and has been adapted into many variants to serve a wide range of applications. However, these variants primarily focus on enhancing the model's feature extraction capabilities, often at the cost of more parameters and floating point operations (FLOPs). In this paper, we propose GA-UNet (Ghost and Attention U-Net), a lightweight U-Net for medical image segmentation. GA-UNet consists mainly of lightweight GhostV2 bottlenecks that reduce redundant information and Convolutional Block Attention Modules that capture key features. We evaluate our model on four datasets: CVC-ClinicDB, 2018 Data Science Bowl, ISIC-2018, and BraTS 2018 low-grade gliomas (LGG). Experimental results show that GA-UNet outperforms other state-of-the-art (SOTA) models, achieving an F1-score of 0.934 and a mean Intersection over Union (mIoU) of 0.882 on CVC-ClinicDB, an F1-score of 0.922 and an mIoU of 0.860 on the 2018 Data Science Bowl, an F1-score of 0.896 and an mIoU of 0.825 on ISIC-2018, and an F1-score of 0.896 and an mIoU of 0.853 on BraTS 2018 LGG. Additionally, GA-UNet has fewer parameters (2.18M) and lower FLOPs (4.45G) than other SOTA models, which further demonstrates the superiority of our model.
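
For readers unfamiliar with the two segmentation metrics quoted above, the following minimal sketch computes the F1-score (equivalent to the Dice coefficient for binary masks) and the Intersection over Union for one pair of masks. The function name `f1_and_iou` and the list-of-lists mask format are illustrative assumptions; how the paper averages IoU into mIoU (per image or per class) is not stated in the abstract.

```python
def f1_and_iou(pred, target):
    """pred, target: equally sized 2D lists of 0/1 pixel labels."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1          # foreground predicted and present
            elif p and not t:
                fp += 1          # foreground predicted but absent
            elif t and not p:
                fn += 1          # foreground missed
    # Two empty masks agree perfectly, so both scores are defined as 1.
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return f1, iou
```

For a single mask pair the two scores are linked by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU; averages over a dataset need not satisfy that identity exactly.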

  • Article type: Journal Article
    Mastitis is one of the most prevalent diseases negatively affecting ranch products worldwide. It reduces milk production, damages milk quality, increases treatment costs, and can even lead to premature culling of animals. In addition, failure to take effective measures in time allows the disease to spread. The key to reducing the losses caused by mastitis lies in early detection. The application of deep learning, with its powerful feature extraction capability, in the medical field is receiving increasing attention. The main purpose of this study was to establish a deep learning network for buffalo quarter-level mastitis detection based on 3054 ultrasound images of udders from 271 buffaloes. Two data sets were generated, with the somatic cell count (SCC) threshold set to 2 × 10^5 cells/mL and 4 × 10^5 cells/mL, respectively. Udders with an SCC below the threshold were defined as healthy, and all others as mastitis-stricken. The 3054 udder ultrasound images were randomly divided into a training set (70%), a validation set (15%), and a test set (15%). We used the EfficientNet_b3 model in combination with the convolutional block attention module (CBAM) to train the mastitis detection model. To address sample category imbalance, PolyLoss was used as the loss function. The training and validation sets were used to develop the mastitis detection model, and the test set was used to evaluate the network's performance. With an SCC threshold of 2 × 10^5 cells/mL, the network achieved an accuracy of 70.02%, a specificity of 77.93%, a sensitivity of 63.11%, and an area under the receiver operating characteristic curve (AUC) of 0.77 on the test set. The model classified better with an SCC threshold of 4 × 10^5 cells/mL than with 2 × 10^5 cells/mL. Therefore, with mastitis defined as SCC ≥ 4 × 10^5 cells/mL, our deep neural network was determined to be the most suitable model for on-farm mastitis detection, achieving an accuracy of 75.93%, a specificity of 80.23%, a sensitivity of 70.35%, and an AUC of 0.83 on the test set. This study established a quarter-level mastitis detection model that provides a theoretical basis for mastitis detection in buffaloes, which are mostly raised by small farmers in developing countries who lack access to mastitis diagnostics.
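
The labeling rule and the reported metrics can be made concrete with a short sketch. The helper names (`label_quarter`, `binary_metrics`) and the example counts in the usage note are hypothetical; only the SCC thresholds and the metric definitions follow the abstract and standard usage.

```python
def label_quarter(scc_cells_per_ml, threshold=4e5):
    """Return 1 (mastitis) if SCC >= threshold, else 0 (healthy)."""
    return 1 if scc_cells_per_ml >= threshold else 0

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts,
    treating mastitis as the positive class."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # share of mastitis quarters detected
    specificity = tn / (tn + fp)   # share of healthy quarters cleared
    return accuracy, sensitivity, specificity
```

As a made-up example, `binary_metrics(tp=70, fp=20, tn=80, fn=30)` gives an accuracy of 0.75, a sensitivity of 0.70, and a specificity of 0.80. These counts are not the study's data.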

  • Article type: Journal Article
    Pavement is vulnerable to damage from natural disasters, accidents, and other human factors, which results in the formation of cracks. Periodic pavement monitoring facilitates prompt detection and repair of pavement defects, thereby minimizing casualties and property losses. Because of numerous sources of interference, recognizing highway pavement cracks in complex environments poses a significant challenge. Nevertheless, several computer vision approaches have demonstrated notable success in tackling this issue. We employed a novel approach to crack recognition that uses the ResNet34 model with a convolutional block attention module (CBAM), which not only saves parameters and computing power but also integrates seamlessly as a plug-in module. Initially, ResNet18, ResNet34, and ResNet50 models were trained using transfer learning, and the ResNet34 network was selected as the base model. Subsequently, CBAM was integrated into the ResBlock and further training was conducted. Finally, we calculated the precision and average recall on the test set, as well as the recall of each class. The results demonstrate that integrating CBAM into the ResNet34 network improved test accuracy and average recall compared with the base model. Moreover, our proposed model outperformed all other models. The recall rates for transverse crack, longitudinal crack, map crack, repairing, and pavement marking were 88.8%, 86.8%, 88.5%, 98.3%, and 99.9%, respectively. Our model achieves the highest precision of 92.9% and the highest average recall of 92.5%. However, its effectiveness in detecting mesh cracks was found to be unsatisfactory, despite their significant prevalence. In summary, the proposed model shows great potential for crack identification and can serve as a crucial foundation for highway maintenance.
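
As a quick check on the figures above, the sketch below averages the five per-class recall values quoted in the abstract. It assumes that "average recall" means the unweighted (macro) mean over exactly these five classes, which the abstract does not state explicitly; the dictionary keys are simply the class names as quoted.

```python
# Per-class recall (%) as quoted in the abstract.
recall_by_class = {
    "transverse crack": 88.8,
    "longitudinal crack": 86.8,
    "map crack": 88.5,
    "repairing": 98.3,
    "pavement marking": 99.9,
}

def macro_average(values):
    """Unweighted mean: every class counts equally, whatever its size."""
    values = list(values)
    return sum(values) / len(values)

average_recall = macro_average(recall_by_class.values())
```

Under that assumption the mean is 462.3 / 5 = 92.46, which rounds to the reported 92.5%.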

  • Article type: Journal Article
    Objective. The performance of electrocardiogram (ECG) classification algorithms is easily affected by data imbalance, which often leads to poor prediction performance on minority classes and a consequent decrease in the overall performance of the model. Approach. To address this problem, this paper proposes an ECG data augmentation method based on a generative adversarial network (GAN) that combines bidirectional long short-term memory (Bi-LSTM) networks and the convolutional block attention mechanism (CBAM) to improve the overall performance of ECG classification models. We used two ECG databases: the MIT-BIH arrhythmia (MIT-BIH-AR) database and the Chinese cardiovascular disease database (CCDD). The quality of the ECG signals produced by the generative model was assessed using the percent relative difference (PRD), root mean square error (RMSE), Frechet distance (FD), dynamic time warping (DTW), and Pearson correlation (PC) metrics. In addition, we validated the impact of the proposed data augmentation method on ECG classification performance on the MIT-BIH-AR database and the CCDD. Main results. On the MIT-BIH-AR database, the overall accuracy on the augmented, balanced dataset improved to 99.46% for a 15-class heartbeat classification task. On the CCDD, where the focus is on detecting premature ventricular contractions (PVC), the overall accuracy of PVC detection improved to 99.15% after data augmentation. Significance. The experimental results indicate that the data augmentation method proposed in this paper can further improve ECG classification performance.
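
Two of the signal-quality metrics listed above are simple enough to sketch directly. The definitions below are the commonly used ones and are an assumption here, since the abstract does not give formulas: PRD is taken as 100 · sqrt(Σ(x − y)² / Σx²) with x the reference signal, and RMSE as sqrt(mean((x − y)²)). The function names are illustrative.

```python
import math

def prd(reference, generated):
    """PRD (%) between a reference and a generated signal of equal length.
    Assumes the reference is not identically zero."""
    num = sum((x - y) ** 2 for x, y in zip(reference, generated))
    den = sum(x ** 2 for x in reference)
    return 100.0 * math.sqrt(num / den)

def rmse(reference, generated):
    """Root mean square error between two equal-length signals."""
    n = len(reference)
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(reference, generated)) / n
    )
```

Both metrics are 0 for a perfect reconstruction. PRD is normalized by the energy of the reference signal, so it is comparable across recordings with different amplitudes, whereas RMSE stays in the signal's own units.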

  • Article type: Journal Article
    With the recent rise in violent crime, the real-time situation analysis capabilities of widely deployed closed-circuit television have been employed for the deterrence and resolution of criminal activities. Anomaly detection can identify abnormal instances such as violence within the patterns of a specified dataset; however, it is challenged by the fact that data for abnormal situations are scarcer than data for normal situations. Herein, using datasets such as UBI-Fights, RWF-2000, and UCSD Ped1 and Ped2, anomaly detection was approached as a binary classification problem. Frames extracted from each annotated video were reconstructed into a limited number of images of sizes 3×3, 4×3, 4×4, and 5×3 using the method proposed in this paper, forming an input data structure similar to a light field or the patches of a vision transformer. The model was constructed by applying a convolutional block attention module, which includes channel and spatial attention modules, to three-dimensional convolutional residual neural networks with depths of 10, 18, 34, and 50. The proposed model performed better than existing models in detecting abnormal behavior such as violent acts in videos. For instance, on the undersampled UBI-Fights dataset, our network achieved an accuracy of 0.9933, a loss value of 0.0010, an area under the curve of 0.9973, and an equal error rate of 0.0027. These results may contribute significantly to solving real-world problems such as the detection of violent behavior in artificial intelligence systems that use computer vision and real-time video monitoring.

  • Article type: Journal Article
    Insulator defect detection is of great significance for ensuring the stability of power transmission lines. The state-of-the-art object detection network YOLOv5 has been widely used in insulator and defect detection. However, YOLOv5 has limitations such as a poor detection rate and a high computational load when detecting small insulator defects. To solve these problems, we propose a lightweight network for insulator and defect detection. In this network, we introduce the Ghost module into the YOLOv5 backbone and neck to reduce the parameter count and model size, thereby improving performance on unmanned aerial vehicles (UAVs). We also add small-object detection anchors and layers for detecting small defects. In addition, we optimize the YOLOv5 backbone by applying convolutional block attention modules (CBAM), which focus on information critical to insulator and defect detection and suppress non-critical information. Experimental results show that the mean average precision (mAP) of our model reaches 99.4% at a threshold of 0.5 and 91.7% over thresholds from 0.5 to 0.95. The parameter count and model size are reduced to 3,807,372 and 8.79 M, respectively, so the model can easily be deployed on embedded devices such as UAVs. Moreover, the detection speed reaches 10.9 ms/image, which meets the requirement for real-time detection.
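
The thresholds of 0.5 and 0.5 to 0.95 quoted above conventionally refer to the box Intersection over Union (IoU) at which a predicted box counts as a correct detection; that reading is an assumption here, as is the 0.05 step between thresholds. The sketch below shows only the IoU test that underlies mAP, not the full precision-recall computation. The function names and the `(x1, y1, x2, y2)` corner format are illustrative.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_match(pred_box, gt_box, iou_threshold=0.5):
    """A prediction counts as correct if its IoU meets the threshold."""
    return box_iou(pred_box, gt_box) >= iou_threshold

# Ten IoU thresholds 0.50, 0.55, ..., 0.95 (assumed step of 0.05).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

Small defects are penalized heavily by this criterion, because a shift of a few pixels in a small box changes its IoU far more than the same shift in a large box.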

  • Article type: Journal Article
    The semantic segmentation of apples from images plays an important role in the automation of the apple industry. However, existing semantic segmentation methods such as FCN and UNet suffer from low speed and accuracy when segmenting apple images with complex backgrounds or rotten parts. To address these problems, this paper proposes DeepMDSCBA, a deep learning segmentation network. The model is based on the DeepLabV3+ structure and uses a lightweight MobileNet module in the encoder for feature extraction, which reduces the amount of parameter computation and the memory requirements. Instead of ordinary convolution, DeepMDSCBA uses depthwise separable convolution to reduce the number of parameters and improve computation speed. In the feature extraction module and the atrous spatial pyramid pooling (ASPP) module of DeepMDSCBA, a Convolutional Block Attention Module is added to filter background information, which reduces the loss of apple edge detail in images, improves the accuracy of feature extraction, and effectively reduces the loss of feature details and deep information. This paper also explores the effects of rot degree, rot position, apple variety, and background complexity on the semantic segmentation of apple images, and thereby verifies the robustness of the method. The experimental results show that the model reaches a PA of 95.3% and an MIoU of 87.1%, improvements of 3.4% and 3.1%, respectively, over DeepLabV3+, and that it outperforms other semantic segmentation networks such as UNet and PSPNet. In addition, the proposed DeepMDSCBA model performs better than the other methods considered under varying factors such as the degree or position of rotten parts, apple variety, and background complexity.
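
The parameter saving from depthwise separable convolution mentioned above follows from simple counting. The sketch below compares the weight counts of a standard k×k convolution with those of a depthwise k×k convolution followed by a pointwise 1×1 convolution, ignoring biases. The layer sizes in the usage note are arbitrary examples, not values from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a pointwise 1 x 1 conv (no bias)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise
```

For example, with k = 3, 32 input channels, and 64 output channels, the standard layer has 3·3·32·64 = 18,432 weights while the separable version has 3·3·32 + 32·64 = 2,336, roughly 7.9 times fewer. In general the ratio of separable to standard weights is 1/c_out + 1/k².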