Efficient channel attention

  • Article type: Journal Article
    The number of wheat spikes strongly influences wheat yield, so rapid and accurate spike detection matters for yield estimation and food security. Computer vision and machine learning have been widely studied as alternatives to manual detection. However, high-accuracy models are computationally intensive and slow, while lightweight models tend to have lower precision. To address this trade-off, YOLO-FastestV2 was selected as the base model for a comprehensive study of wheat spike detection. We constructed a wheat detection dataset comprising 11,451 images and 496,974 bounding boxes, built from the Global Wheat Detection Dataset and the Wheat Sheaf Detection Dataset published by PP Flying Paddle. We selected three attention mechanisms, Large Separable Kernel Attention (LSKA), Efficient Channel Attention (ECA), and Efficient Multi-Scale Attention (EMA), to enhance the feature extraction capability of the backbone network and improve the accuracy of the base model. First, each attention mechanism was added after the base and output phases of the backbone network. Second, the attention mechanisms that most improved accuracy after the base and output phases were selected to build a model with attention added at both phases. In addition, we constructed SimLightFPN by introducing SimConv into the LightFPN module to further improve accuracy. The YOLO-FastestV2-SimLightFPN-ECA-EMA hybrid model, which uses ECA in the base stage and combines EMA with the SimLightFPN module in the output stage, had the best overall performance: P = 83.91%, R = 78.35%, AP = 81.52%, and F1 = 81.03%, and it ranked first on GPI (0.84) in the overall evaluation. The study examines the deployment of wheat spike detection and counting models on resource-constrained devices, offering new solutions for agricultural automation and precision agriculture.
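As a quick consistency check on the reported metrics: F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the P and R values quoted in the abstract. The function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
f1 = f1_score(83.91, 78.35)
print(f"F1 = {f1:.2f}%")  # F1 = 81.03%
```

The recomputed value agrees with the F1 of 81.03% stated in the abstract.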






  • Article type: Journal Article
    Fall events can have severe consequences, so accurate detection is critical. This paper introduces the Spatial Channel and Pooling Enhanced You Only Look Once version 5 small (SCPE-YOLOv5s) model. Falls are hard to detect because they vary in scale and have subtle pose features. To address this, SCPE-YOLOv5s adds spatial attention to the Efficient Channel Attention (ECA) network, which significantly enhances the model's ability to extract features from the spatial distribution of poses. The model also integrates average pooling layers into the Spatial Pyramid Pooling (SPP) network to support multi-scale extraction of fall poses, and it incorporates the ECA network into SPP to combine global and local features and further enhance feature extraction. Validated on a public dataset, SCPE-YOLOv5s achieves a mean Average Precision of 88.29%, outperforming You Only Look Once version 5 small (YOLOv5s) by 4.87%, and runs at 57.4 frames per second. SCPE-YOLOv5s therefore provides a novel solution for fall event detection.






  • Article type: Journal Article
    Around 70 million people worldwide are affected by epilepsy, a neurological disorder characterized by non-induced seizures that occur at irregular and unpredictable intervals. During an epileptic seizure, transient symptoms emerge as a result of extreme abnormal neural activity. Epilepsy imposes limitations on individuals and has a significant impact on the lives of their families. Reliable diagnostic tools for early detection can therefore help alleviate the social and emotional distress experienced by patients. Although the Bonn University dataset contains five collections of EEG data, few studies focus specifically on subsets D and E, which correspond to EEG recordings from the epileptogenic zone during ictal and interictal events. This work introduces the parallel ictal-net (PIN) neural network architecture, which uses scalograms obtained through a continuous wavelet transform to classify EEG signals as ictal or interictal with high accuracy. The results show that the proposed PIN model distinguishes ictal from interictal events reliably: accuracy, precision, recall, and F1 score all consistently reach around 99%, surpassing previous approaches in the related literature.






  • Article type: Journal Article
    Accurate segmentation of breast ultrasound (BUS) images is crucial for early diagnosis and treatment of breast cancer. However, segmenting lesions in BUS images remains challenging because convolutional neural networks (CNNs) are limited in capturing long-range dependencies and global context, and existing methods that rely solely on CNNs have struggled to address these issues. Recently, ConvNeXts have emerged as a promising CNN architecture, while transformers have demonstrated outstanding performance in diverse computer vision tasks, including medical image analysis. In this paper, we propose CS-Net, a novel breast lesion segmentation network that combines the strengths of the ConvNeXt and Swin Transformer models to enhance the performance of the U-Net architecture. The network operates on BUS images and performs segmentation end to end. To address the limitations of CNNs, we design a hybrid encoder that incorporates modified ConvNeXt convolutions and Swin Transformer. To better capture spatial and channel attention in feature maps, we also incorporate the Coordinate Attention Module. In addition, we design an Encoder-Decoder Features Fusion Module that fuses low-level features from the encoder with high-level semantic features from the decoder during image reconstruction. Experimental results demonstrate the superiority of our network over state-of-the-art image segmentation methods for BUS lesion segmentation.






  • Article type: Journal Article
    Alzheimer's disease (AD) is a progressive neurodegenerative disease. Early detection and intervention are crucial in preventing the progression of AD. To achieve efficient and scalable automatic AD detection based on structural Magnetic Resonance Imaging (sMRI), this paper proposes a lightweight neural network that uses multi-slice sMRI. The feature extraction backbone is based on the ShuffleNet V1 architecture, which is effective for overcoming the constraints of limited sMRI data and resource-restricted devices. In addition, we incorporate Efficient Channel Attention (ECA) to capture cross-channel interaction information, which effectively enhances the features of disease-associated brain regions. To optimize the model, we employ both cross-entropy loss and triplet loss: the former constrains the predicted probabilities to the ground-truth labels, and the latter ensures an appropriate representation of distances between classes in the learned features. Experimental results show that the classification accuracies of our method for the AD vs. CN, AD vs. MCI, and MCI vs. CN tasks are 95.00%, 87.50%, and 85.62%, respectively. Our method uses only 3.42 M parameters and 6.08 G FLOPs while maintaining performance comparable to five other recent lightweight methods. The design is computationally efficient, allowing it to process large amounts of data quickly and accurately. It also has the potential to advance the intelligent detection of Alzheimer's disease on devices with limited computing capabilities.
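The abstracts in this list do not spell out how ECA computes its channel weights. The following is a minimal pure-Python sketch of the commonly described design: global average pooling per channel, a small 1D convolution across the channel axis, then a sigmoid gate. The function name `eca`, the nested-list tensor layout, and the caller-supplied kernel are assumptions for illustration. In a real network the kernel is learned and the operation runs on tensors in a deep learning framework.

```python
import math


def eca(feature_maps, kernel_weights):
    """Apply ECA-style gating to feature_maps[c][h][w] (nested lists).

    kernel_weights is a 1D kernel of odd length k shared by all channels.
    It is supplied by the caller here; in a trained model it is learned.
    """
    k = len(kernel_weights)
    assert k % 2 == 1, "kernel size must be odd"
    pad = k // 2

    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]

    # 2) 1D convolution across the channel axis (zero padding), so each
    #    channel's weight depends only on its k nearest neighbours.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[c + j] for j, w in enumerate(kernel_weights))
        for c in range(len(pooled))
    ]

    # 3) Sigmoid gate, then rescale every value in the channel.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    scaled = [
        [[g * v for v in row] for row in fmap]
        for fmap, g in zip(feature_maps, gates)
    ]
    return scaled, gates


# Three 2x2 channels with means 1, 0, and 2.
x = [[[1.0, 1.0], [1.0, 1.0]],
     [[0.0, 0.0], [0.0, 0.0]],
     [[2.0, 2.0], [2.0, 2.0]]]
y, gates = eca(x, [0.0, 1.0, 0.0])  # identity kernel: gate = sigmoid(mean)
```

Because the convolution touches only k neighbouring channels, the module adds just k parameters, which is why it suits lightweight backbones such as the one described above.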






  • Article type: Journal Article
    A brain-computer interface (BCI) enables the control of external devices using signals from the brain, offering immense potential for assisting individuals with neuromuscular disabilities. Among the different BCI paradigms, the motor imagery (MI) based electroencephalogram (EEG) signal is widely recognized as exceptionally promising. Deep learning (DL) has been applied extensively to the processing of MI signals, and convolutional neural networks (CNNs) have demonstrated superior performance compared to conventional machine learning (ML) approaches. Nevertheless, challenges related to subject independence and subject dependence persist, and the inherently low signal-to-noise ratio of EEG signals remains a critical issue. Accurately decoding intentions from EEG signals is still a formidable challenge. This paper introduces an end-to-end network that combines efficient channel attention (ECA) and temporal convolutional network (TCN) components for the classification of motor imagery signals. An ECA module is placed before feature extraction to enhance the extraction of channel-specific features. A compact convolutional network is used for feature extraction in the middle part. Finally, temporal features are obtained with the TCN. The results show that the network is lightweight, with few parameters and fast inference, and achieves an average accuracy of 80.71% on the BCI Competition IV-2a dataset.
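The temporal stage of a TCN is built from dilated causal convolutions. The sketch below shows that building block in plain Python. The kernel size, the dilation schedule (1, 2, 4), and the function names are assumptions chosen for illustration, since the abstract does not give the network's actual configuration.

```python
def causal_dilated_conv(signal, weights, dilation=1):
    """1D causal convolution: output[t] uses only signal[t], signal[t-d], ...

    weights[0] multiplies the current sample, weights[i] the sample
    i * dilation steps in the past. Samples before t = 0 count as zero.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Samples seen by a stack with one such convolution per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = causal_dilated_conv(x, [1.0, 1.0], dilation=2)  # x[t] + x[t-2]
print(y)                              # [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(2, [1, 2, 4]))  # 8
```

Doubling the dilation at each layer grows the receptive field exponentially with depth while the parameter count grows only linearly, which fits the lightweight design the abstract emphasizes.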






  • Article type: Journal Article
    With weight-sharing and continuous relaxation strategies, differentiable architecture search (DARTS) offers a fast and effective way to perform neural architecture search for a variety of deep learning tasks. However, two issues remain unresolved: inefficient memory utilization, and poor stability of the searched architecture caused by random channel selection, which can even lead to performance collapse. In this paper, we propose EPC-DARTS, a novel efficient channel attention mechanism based on partial channel connection for differentiable neural architecture search, to address these two issues. Specifically, we design an efficient channel attention module that captures cross-channel interactions and assigns weights according to channel importance, which dramatically improves search efficiency and reduces memory use. Moreover, only the subset of channels with higher weights takes part in the mixed-operation computation, so the unstable architectures produced by random channel selection are avoided. Experimental results show that EPC-DARTS achieves highly competitive performance compared to other state-of-the-art NAS methods (test accuracy of 97.60% on CIFAR-10 and 84.02% on CIFAR-100) using only 0.2 GPU-days.
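The idea of routing only the highest-weighted channels through the mixed operation can be sketched as follows. This is an illustration under stated assumptions, not the EPC-DARTS implementation: the 1/4 sampling fraction, the function names, and the `mixed_op` callable are all hypothetical, and each channel is reduced to a single value for brevity.

```python
def select_top_channels(weights, fraction=0.25):
    """Return sorted indices of the channels with the largest weights."""
    n_keep = max(1, int(len(weights) * fraction))
    ranked = sorted(range(len(weights)), key=lambda c: weights[c], reverse=True)
    return sorted(ranked[:n_keep])


def partial_mixed_op(channels, weights, mixed_op, fraction=0.25):
    """Apply mixed_op to the selected channels; pass the rest through unchanged."""
    selected = set(select_top_channels(weights, fraction))
    return [mixed_op(ch) if c in selected else ch
            for c, ch in enumerate(channels)]


# Hypothetical attention weights for eight channels.
attention = [0.10, 0.90, 0.30, 0.75, 0.20, 0.05, 0.60, 0.40]
print(select_top_channels(attention, fraction=0.25))  # [1, 3]
```

Selecting channels by weight is deterministic for a given set of attention values, in contrast to the random selection that the abstract identifies as a source of instability.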






  • Article type: Journal Article
    Early and accurate mammography screening and diagnosis can reduce breast cancer mortality. Although CNN-based computer-aided diagnosis (CAD) systems for breast cancer have achieved significant results in recent years, precise diagnosis of lesions in mammograms remains a challenge because of the low signal-to-noise ratio (SNR) and physiological characteristics. Many researchers have achieved excellent performance on mammographic images by supplying region of interest (ROI) annotations as input, but ROI annotations require a great deal of manual labor, time, and resources. We propose a two-stage method that combines image preprocessing and model optimization to address these challenges. First, we propose the breast database preprocess (BDP) method and apply it to INbreast to obtain INbreast†. The only label required is whether a mammogram is benign or malignant, with no manual labeling such as ROI annotations. Second, we apply focal loss to ECA-Net50, an improved model based on ResNet50 with an efficient channel attention (ECA) module. Our method adaptively extracts the key features of mammograms while also addressing hard-to-classify samples and class imbalance. On INbreast†, our method achieves an AUC of 0.960, an accuracy of 0.929, and a recall of 0.928. Its precision on INbreast† is 0.883, an improvement of 0.254 over ResNet50. In addition, we use Grad-CAM to visualize the effect of our model, and the resulting heatmaps focus more on lesion regions. Both numerical and visual experiments demonstrate that our method achieves satisfactory performance.
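Focal loss down-weights easy examples so that training concentrates on hard or minority-class samples. The sketch below shows the binary form for a single sample. The defaults `gamma=2.0` and `alpha=0.25` are common choices from the original focal loss formulation and are assumptions here, since the abstract does not state the values used.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one sample.

    p: predicted probability of the positive class (0 < p < 1)
    y: ground-truth label, 1 (positive) or 0 (negative)
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct
hard = binary_focal_loss(0.30, 1)  # misclassified positive
print(easy < hard)                 # True
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy. Larger `gamma` values shrink the contribution of well-classified samples more aggressively.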






  • Article type: Journal Article
    In recent years, deep learning has been applied to intelligent fault diagnosis with great success. However, deep learning fault diagnosis methods assume that the training and test datasets are obtained under the same operating conditions, an assumption that can hardly be met in real application scenarios. Signal preprocessing also has an important influence on intelligent fault diagnosis, and how to couple signal preprocessing effectively with a transfer diagnostic model is a challenge. To solve these problems, we propose a novel deep transfer learning method for intelligent fault diagnosis based on Variational Mode Decomposition (VMD) and Efficient Channel Attention (ECA). In the proposed method, VMD adaptively matches the optimal center frequency and finite bandwidth of each mode to achieve effective separation of signals. To fuse the mode features more effectively after VMD, ECA is used to learn channel attention. The experimental results show that the proposed signal preprocessing and feature fusion module increases the accuracy and generality of the transfer diagnostic model. Moreover, we comprehensively compare our method with state-of-the-art methods at different noise levels, and the results show that it has better robustness and generalization performance.





