Efficient channel attention

  • Article type: Journal Article
    The number of wheat spikes has an important influence on wheat yield, and rapid, accurate detection of wheat spike numbers is of great significance for wheat yield estimation and food security. Computer vision and machine learning have been widely studied as potential alternatives to human detection. However, high-accuracy models are computationally intensive and time-consuming, while lightweight models tend to have lower precision. To address these concerns, YOLO-FastestV2 was selected as the base model for a comprehensive study and analysis of wheat sheaf detection. In this study, we constructed a wheat target detection dataset comprising 11,451 images and 496,974 bounding boxes. The dataset was built from the Global Wheat Detection Dataset and the Wheat Sheaf Detection Dataset, the latter published by PP Flying Paddle. We selected three attention mechanisms, Large Separable Kernel Attention (LSKA), Efficient Channel Attention (ECA), and Efficient Multi-Scale Attention (EMA), to enhance the feature extraction capability of the backbone network and improve the accuracy of the underlying model. First, each attention mechanism was added after the base phase and after the output phase of the backbone network. Second, the attention mechanisms that improved model accuracy at each of these two phases were selected to construct a model with attention added at both phases. In addition, we constructed SimLightFPN by introducing SimConv into the LightFPN module to improve model accuracy. The results showed that the YOLO-FastestV2-SimLightFPN-ECA-EMA hybrid model, which incorporates the ECA attention mechanism in the base stage and combines the EMA attention mechanism with the SimLightFPN module in the output stage, has the best overall performance. The model achieved P = 83.91%, R = 78.35%, AP = 81.52%, and F1 = 81.03%, and it ranked first in the GPI (0.84) in the overall evaluation. The research examines the deployment of wheat ear detection and counting models on resource-constrained devices, offering new solutions for agricultural automation and precision agriculture.
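The reported F1 value can be cross-checked against the reported precision and recall. The sketch below assumes the abstract uses the standard definition of F1 as the harmonic mean of P and R; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
p, r = 83.91, 78.35
print(round(f1_score(p, r), 2))  # agrees with the reported F1 of 81.03
```

Under that assumption the three reported figures are mutually consistent.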

  • Article type: Journal Article
    Addressing the critical need for accurate fall event detection, given the potentially severe impact of falls, this paper introduces the Spatial Channel and Pooling Enhanced You Only Look Once version 5 small (SCPE-YOLOv5s) model. Fall events are challenging to detect because of their varying scales and subtle pose features. To address this problem, SCPE-YOLOv5s introduces spatial attention into the Efficient Channel Attention (ECA) network, which significantly enhances the model's ability to extract features from the spatial pose distribution. Moreover, the model integrates average pooling layers into the Spatial Pyramid Pooling (SPP) network to support multi-scale extraction of fall poses. Meanwhile, by incorporating the ECA network into SPP, the model effectively combines global and local features to further enhance feature extraction. This paper validates SCPE-YOLOv5s on a public dataset, demonstrating that it achieves a mean Average Precision of 88.29%, outperforming You Only Look Once version 5 small by 4.87%. Additionally, the model achieves 57.4 frames per second. Therefore, SCPE-YOLOv5s provides a novel solution for fall event detection.

  • Article type: Journal Article
    Around 70 million people worldwide are affected by epilepsy, a neurological disorder characterized by unprovoked seizures that occur at irregular and unpredictable intervals. During an epileptic seizure, transient symptoms emerge as a result of extremely abnormal neural activity. Epilepsy imposes limitations on individuals and has a significant impact on the lives of their families. Therefore, the development of reliable diagnostic tools for the early detection of this condition is considered beneficial for alleviating the social and emotional distress experienced by patients. While the Bonn University dataset contains five collections of EEG data, few studies focus specifically on subsets D and E. These subsets correspond to EEG recordings from the epileptogenic zone during ictal and interictal events. In this work, the parallel ictal-net (PIN) neural network architecture is introduced, which uses scalograms obtained through a continuous wavelet transform to achieve high-accuracy classification of EEG signals into ictal or interictal states. The results demonstrate the effectiveness of the proposed PIN model in distinguishing between ictal and interictal events with a high degree of confidence. This is validated by computing accuracy, precision, recall, and F1 scores, all of which consistently reach around 99%, surpassing previous approaches in the related literature.

  • Article type: Journal Article
    Accurate segmentation of breast ultrasound (BUS) images is crucial for early diagnosis and treatment of breast cancer. However, segmenting lesions in BUS images continues to pose significant challenges because convolutional neural networks (CNNs) are limited in capturing long-range dependencies and obtaining global context information. Existing methods relying solely on CNNs have struggled to address these issues. Recently, ConvNeXts have emerged as a promising CNN architecture, while transformers have demonstrated outstanding performance in diverse computer vision tasks, including the analysis of medical images. In this paper, we propose a novel breast lesion segmentation network, CS-Net, that combines the strengths of the ConvNeXt and Swin Transformer models to enhance the performance of the U-Net architecture. Our network operates on BUS images and adopts an end-to-end approach to perform segmentation. To address the limitations of CNNs, we design a hybrid encoder that incorporates modified ConvNeXt convolutions and the Swin Transformer. Furthermore, to better capture spatial and channel attention in feature maps, we incorporate the Coordinate Attention Module. In addition, we design an Encoder-Decoder Features Fusion Module that fuses low-level features from the encoder with high-level semantic features from the decoder during image reconstruction. Experimental results demonstrate the superiority of our network over state-of-the-art image segmentation methods for BUS lesion segmentation.

  • Article type: Journal Article
    Alzheimer's disease (AD) is a progressive neurodegenerative disease. Early detection and intervention are crucial in preventing the progression of AD. To achieve efficient and scalable AD auto-detection based on structural Magnetic Resonance Imaging (sMRI), a lightweight neural network using multi-slice sMRI is proposed in this paper. The backbone for feature extraction is based on the ShuffleNet V1 architecture, which is effective for overcoming the limitations posed by limited sMRI data and resource-restricted devices. In addition, we incorporate Efficient Channel Attention (ECA) to capture cross-channel interaction information, enabling us to effectively enhance features of disease-associated brain regions. To optimize the model, we employ both cross-entropy loss and triplet loss functions to constrain the predicted probabilities to the ground-truth labels and to ensure appropriate representation of distances between different classes in the learned features. Experimental results show that the classification accuracies of our method for the AD vs. CN, AD vs. MCI, and MCI vs. CN classification tasks are 95.00%, 87.50%, and 85.62%, respectively. Our method uses only 3.42 M parameters and 6.08 GFLOPs while maintaining performance comparable to five other recent lightweight methods. This model design is computationally efficient, allowing it to process large amounts of data quickly and accurately. Additionally, it has the potential to advance the intelligent detection of Alzheimer's disease on devices with limited computing capabilities.
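The way ECA captures cross-channel interaction can be sketched in a few lines. The following is a minimal, framework-free illustration of the general ECA idea as it is commonly described (global average pooling, a shared 1D convolution across neighbouring channels, a sigmoid gate); it is not the implementation used in this or any other listed paper. The function names, the nested-list layout, and the fixed kernel passed in by hand are all assumptions made for illustration, and the kernel-size heuristic uses the commonly cited defaults gamma = 2 and b = 1.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive odd kernel size derived from the channel count (assumed heuristic)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, kernel):
    """Apply channel attention to a list of C channels, each an H x W nested list.

    kernel is an odd-length list of 1D convolution weights shared by all channels.
    Returns the rescaled feature map and the per-channel gates.
    """
    # 1. Global average pooling: one scalar descriptor per channel.
    desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # 2. 1D convolution across neighbouring channels, with zero padding.
    pad = len(kernel) // 2
    padded = [0.0] * pad + desc + [0.0] * pad
    logits = [sum(w * padded[c + j] for j, w in enumerate(kernel))
              for c in range(len(desc))]
    # 3. Sigmoid gate, then rescale every value in a channel by that channel's gate.
    gates = [sigmoid(z) for z in logits]
    out = [[[g * v for v in row] for row in ch]
           for ch, g in zip(feature_map, gates)]
    return out, gates


# Three constant 2 x 2 channels and a hand-picked identity kernel (illustration only).
fmap = [[[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]],
        [[0.0, 0.0], [0.0, 0.0]]]
out, gates = eca(fmap, kernel=[0.0, 1.0, 0.0])
print(eca_kernel_size(256), gates)
```

In a trained network the kernel weights are learned; the point of the design is that each channel's gate depends only on a small neighbourhood of channel descriptors, which keeps the parameter count very low.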

  • Article type: Journal Article
    A brain-computer interface (BCI) enables the control of external devices using signals from the brain, offering immense potential for assisting individuals with neuromuscular disabilities. Among the different paradigms of BCI systems, the motor imagery (MI) based electroencephalogram (EEG) signal is widely recognized as exceptionally promising. Deep learning (DL) has found extensive applications in the processing of MI signals, where convolutional neural networks (CNNs) have demonstrated superior performance compared to conventional machine learning (ML) approaches. Nevertheless, challenges related to subject independence and subject dependence persist, and the inherently low signal-to-noise ratio of EEG signals remains a critical aspect that demands attention. Accurately deciphering intentions from EEG signals continues to present a formidable challenge. This paper introduces an advanced end-to-end network that effectively combines efficient channel attention (ECA) and temporal convolutional network (TCN) components for the classification of motor imagery signals. We incorporate an ECA module prior to feature extraction in order to enhance the extraction of channel-specific features. A compact convolutional network model is used for feature extraction in the middle part. Finally, temporal feature information is obtained using the TCN. The results show that our network is lightweight, with few parameters and fast speed. Our network achieves an average accuracy of 80.71% on the BCI Competition IV-2a dataset.
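TCNs are generally built from dilated causal convolutions, in which each output sample depends only on the current and earlier input samples. The sketch below illustrates that single operation in plain Python. The abstract does not give the network's filter sizes or dilation rates, so the weights and dilation used here are purely illustrative, and the function name is hypothetical.

```python
def causal_dilated_conv1d(signal, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * signal[t - k * dilation].

    Samples before t = 0 are treated as zero, so no future input is ever used.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


# Toy signal; two taps with dilation 2 add each sample to the one two steps earlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(causal_dilated_conv1d(x, [1.0, 1.0], dilation=2))
```

Stacking such layers with growing dilation widens the temporal receptive field without increasing the number of weights per layer, which is one reason TCN-based models can stay small.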

  • Article type: Journal Article
    With weight-sharing and continuous relaxation strategies, differentiable architecture search (DARTS) offers a fast and effective way to perform neural architecture search across deep learning tasks. However, unresolved issues, such as inefficient memory utilization and poor stability of the searched architecture caused by randomly selected channels, which can even lead to performance collapse, still trouble researchers and practitioners. In this paper, a novel efficient channel attention mechanism based on partial channel connections for differentiable neural architecture search, termed EPC-DARTS, is proposed to address these two issues. Specifically, we design an efficient channel attention module that captures cross-channel interactions and assigns weights based on channel importance, which dramatically improves search efficiency and reduces memory occupation. Moreover, through the efficient channel attention mechanism, only the partial channels with higher weights are used in the mixed operation computation, so the unstable network architectures produced by random channel selection are avoided in EPC-DARTS. Experimental results show that EPC-DARTS achieves remarkably competitive performance (test accuracy of 97.60%/84.02% on CIFAR-10/CIFAR-100) compared with other state-of-the-art NAS methods while using only 0.2 GPU-days.
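The channel selection step described in this abstract, keeping the channels with higher attention weights rather than a random subset, can be illustrated with a small sketch. The selection fraction, the example weights, and the function name are assumptions for illustration; the abstract does not state how many channels EPC-DARTS keeps.

```python
def select_top_channels(weights, fraction=0.25):
    """Return the indices of the highest-weighted channels, in ascending index order.

    fraction is a hypothetical setting: the share of channels to keep.
    """
    k = max(1, int(len(weights) * fraction))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


# Made-up attention weights for eight channels.
attn = [0.10, 0.90, 0.30, 0.75, 0.20, 0.05, 0.60, 0.40]
print(select_top_channels(attn, fraction=0.25))
```

Because the selection is a deterministic function of the attention weights, repeated runs with the same weights pick the same channels, which is the stability argument the abstract makes against random selection.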

  • Article type: Journal Article
    Early and accurate mammography screening and diagnosis can reduce breast cancer mortality. Although CNN-based computer-aided diagnosis (CAD) systems for breast cancer have achieved significant results in recent years, precise diagnosis of lesions in mammograms remains a challenge due to low signal-to-noise ratio (SNR) and physiological characteristics. Many researchers have achieved excellent performance on mammographic images by supplying region of interest (ROI) annotations, but ROI annotations require a great deal of manual labor, time, and resources. We propose a two-stage method that combines image preprocessing and model optimization to address these challenges. First, we propose the breast database preprocess (BDP) method to preprocess INbreast, yielding INbreast†. The only label we need is the benign or malignant label of each mammogram, not manual labeling such as ROI annotations. Second, we apply focal loss to ECA-Net50, an improved model based on ResNet50 with an efficient channel attention (ECA) module. Our method can adaptively extract the key features of mammograms while addressing the problems of hard-to-classify samples and unbalanced categories. On INbreast†, our method achieves an AUC of 0.960, an accuracy of 0.929, and a recall of 0.928. Its precision on INbreast† is 0.883, an improvement of 0.254 over ResNet50. In addition, we use Grad-CAM to visualize the effect of our model. The heatmaps produced by our method focus more on lesion regions. Both numerical and visualization experiments demonstrate that our method achieves satisfactory performance.
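Focal loss addresses hard-to-classify samples and class imbalance by down-weighting examples the model already classifies confidently. A minimal sketch of the standard binary form for a single sample follows. The defaults gamma = 2.0 and alpha = 0.25 are commonly used values and are assumptions here; the abstract does not report the settings used with ECA-Net50.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one sample: -alpha_t * (1 - p_t) ** gamma * log(p_t).

    p is the predicted probability of the positive class, strictly between 0 and 1.
    y is the true label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct positive contributes far less than a badly misclassified one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy, hard, easy < hard)
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, so gamma controls how strongly easy examples are suppressed.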

  • Article type: Journal Article
    In recent years, deep learning has been applied to intelligent fault diagnosis with great success. However, deep learning fault diagnosis methods assume that the training dataset and the test dataset are obtained under the same operating conditions. This condition can hardly be met in real application scenarios. Additionally, signal preprocessing technology has an important influence on intelligent fault diagnosis. How to effectively couple signal preprocessing with a transfer diagnostic model is a challenge. To solve these problems, we propose a novel deep transfer learning method for intelligent fault diagnosis based on Variational Mode Decomposition (VMD) and Efficient Channel Attention (ECA). In the proposed method, VMD adaptively matches the optimal center frequency and finite bandwidth of each mode to achieve effective separation of signals. To fuse the mode features more effectively after VMD, ECA is used to learn channel attention. The experimental results show that the proposed signal preprocessing and feature fusion module can increase the accuracy and generality of the transfer diagnostic model. Moreover, we comprehensively analyze and compare our method with state-of-the-art methods at different noise levels, and the results show that our proposed method has better robustness and generalization performance.