Spatial pyramid pooling

  • Article type: Journal Article
    BACKGROUND: Sleep spindles have emerged as valuable biomarkers for assessing cognitive abilities and related disorders, underscoring the importance of their detection in clinical research. However, template matching-based algorithms that use fixed templates may not fully adapt to spindles of different durations. Multiscale feature extraction, inspired by its use in image analysis, can better adapt to spindles of different frequencies and durations.
    METHODS: Therefore, this study proposes a novel automatic spindle detection algorithm based on elastic time windows and spatial pyramid pooling (SPP) for extracting multiscale features. The algorithm utilizes elastic time windows to segment electroencephalogram (EEG) signals, enabling the extraction of features across multiple scales. This approach accommodates significant variations in spindle duration and polarization positioning during different EEG epochs. Additionally, spatial pyramid pooling is integrated into a depthwise separable convolutional (DSC) network to perform multiscale pooling on the segmented spindle signal features at different scales.
    RESULTS: Compared with existing template matching algorithms, this algorithm's spindle wave polarization positioning is more consistent with the real situation. Experiments on the public DREAMS dataset show that the algorithm reaches an average accuracy of 95.75% and an average negative predictive value (NPV) of 96.55%, indicating its advanced performance.
    CONCLUSIONS: The effectiveness of each module was verified through thorough ablation experiments. More importantly, the algorithm remains robust to variation across experimental subjects. This robustness makes the algorithm more accurate at identifying sleep spindles, and it is expected to help experts automatically detect spindles in sleep EEG signals, reducing the workload and time of manual detection and improving efficiency.
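
The two metrics reported in the RESULTS above, accuracy and negative predictive value, can be illustrated with a minimal sketch. The function name `confusion_metrics` and the counts in the example are hypothetical and are not taken from the DREAMS experiments; the abstract does not say how the averages were formed.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, NPV) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Accuracy: share of all windows that were classified correctly.
    accuracy = (tp + tn) / total
    # NPV: share of predicted non-spindle windows that are truly non-spindle.
    npv = tn / (tn + fn)
    return accuracy, npv


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
acc, npv = confusion_metrics(tp=40, fp=10, tn=140, fn=10)
print(f"accuracy={acc:.4f}, NPV={npv:.4f}")
```

A high NPV matters in spindle detection because most EEG windows contain no spindle, so the negative class dominates.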






  • Article type: Journal Article
    Crop yield could be enhanced for agricultural growth if plant nutrition deficiencies and diseases are identified at early stages. Hence, continuous health monitoring of plants is crucial for handling plant stress. Deep learning methods have proven their superior performance in the automated detection of plant diseases and nutrition deficiencies from visual symptoms in leaves. This article proposes a new deep learning method for plant nutrition deficiency and disease classification using a graph convolutional network (GCN) added on top of a base convolutional neural network (CNN). Sometimes, a global feature descriptor fails to capture the vital region of a diseased leaf, which causes inaccurate disease classification. To address this issue, regional feature learning is crucial for holistic feature aggregation. In this work, region-based feature summarization at multiple scales is explored using spatial pyramid pooling for discriminative feature representation. Furthermore, a GCN is developed to enable learning of finer details for classifying plant diseases and nutrient deficiencies. The proposed method, called Plant Nutrition Deficiency and Disease Network (PND-Net), has been evaluated on two public datasets for nutrition deficiency and two for disease classification using four backbone CNNs. The best classification performances of the proposed PND-Net are as follows: (a) 90.00% on Banana and 90.54% on Coffee nutrition deficiency; and (b) 96.18% on Potato diseases and 84.30% on PlantDoc, using the Xception backbone. Furthermore, additional experiments have been carried out for generalization, and the proposed method has achieved state-of-the-art performance on two public datasets, namely the Breast Cancer Histopathology Image Classification dataset (BreakHis 40×: 95.50% and BreakHis 100×: 96.79% accuracy) and Single cells in Pap smear images for cervical cancer classification (SIPaKMeD: 99.18% accuracy). The proposed method has also been evaluated using five-fold cross-validation and achieved improved performance on these datasets. Overall, PND-Net effectively boosts the performance of automated health analysis of various plants in real and intricate field environments, implying PND-Net's aptness for agricultural growth as well as human cancer classification.






  • Article type: Journal Article
    Protein-protein interactions (PPIs) play an essential role in life activities. Many artificial intelligence algorithms based on protein sequence information have been developed to predict PPIs. However, these models have difficulty handling variable sequence lengths and suffer from low generalization and prediction accuracy. In this study, we propose a novel end-to-end deep learning framework, RSPPI, which combines a residual neural network (ResNet) and spatial pyramid pooling (SPP) to predict PPIs based on the physicochemical properties of the protein sequence and spatial structural information. In the RSPPI model, ResNet is employed to extract structural and physicochemical information from the protein three-dimensional structure and primary sequence; the SPP layer transforms the feature maps into a single fixed-length vector, which avoids the fixed-length input requirement. The RSPPI model shows excellent cross-species performance and outperforms several state-of-the-art methods based on either protein sequence or gene ontology in most evaluation metrics. The RSPPI model provides a novel strategy for developing AI-based PPI prediction algorithms.
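
How an SPP layer turns feature maps of different sizes into a vector of one fixed length can be shown with a minimal single-channel sketch. This is not the RSPPI implementation: the function name `spp_max`, the pyramid levels (1, 2, 4), the use of max pooling, and the bin boundaries (floor for the start, ceiling for the end) are all assumptions made for illustration.

```python
def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2D feature map (a list of rows) into a fixed-length vector.

    Each pyramid level n splits the map into an n x n grid and keeps the
    maximum of every cell, so the output length is sum(n * n for n in levels)
    regardless of the input height and width.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of grid cell i: floor for the start, ceiling for the end.
            r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two hypothetical feature maps of different sizes (4 x 5 and 9 x 13).
small = [[r * 5 + c for c in range(5)] for r in range(4)]
large = [[r * 13 + c for c in range(13)] for r in range(9)]
print(len(spp_max(small)), len(spp_max(large)))
```

Both calls return 1 + 4 + 16 = 21 values, which is why a fully connected layer can follow SPP even when input sizes vary. In a multi-channel network the same pooling would be applied to each channel and the results concatenated.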






  • Article type: Journal Article
    Addressing the critical need for accurate fall event detection, given the potentially severe impact of falls, this paper introduces the Spatial Channel and Pooling Enhanced You Only Look Once version 5 small (SCPE-YOLOv5s) model. Fall events are challenging to detect because of their varying scales and subtle pose features. To address this problem, SCPE-YOLOv5s introduces spatial attention to the Efficient Channel Attention (ECA) network, which significantly enhances the model's ability to extract features from the spatial pose distribution. Moreover, the model integrates average pooling layers into the Spatial Pyramid Pooling (SPP) network to support the multi-scale extraction of fall poses. Meanwhile, by incorporating the ECA network into SPP, the model effectively combines global and local features to further enhance feature extraction. This paper validates SCPE-YOLOv5s on a public dataset, demonstrating that it achieves a mean Average Precision of 88.29%, outperforming You Only Look Once version 5 small (YOLOv5s) by 4.87%. Additionally, the model achieves 57.4 frames per second. Therefore, SCPE-YOLOv5s provides a novel solution for fall event detection.






  • Article type: Journal Article
    A bridge disease identification approach based on an enhanced YOLO v3 algorithm is proposed to increase the accuracy of apparent disease detection on concrete bridges under complex backgrounds. First, the YOLO v3 network structure is enhanced to better accommodate the dense distribution and large scale variation of diseases, and the detection layer incorporates a squeeze and excitation (SE) attention module and a spatial pyramid pooling module to strengthen semantic feature extraction. Second, CIoU, which has better localization ability, is selected as the loss function for training. Finally, the K-means algorithm is used for anchor box clustering on the bridge surface disease dataset. To test the efficacy of the algorithm, a dataset of 1363 samples containing exposed reinforcement, spalling, and water erosion damage of bridges was produced, and the network was trained after manual labelling and data augmentation. According to the experimental results, the enhanced YOLO v3 model outperforms the original model in precision, recall, Average Precision (AP), and other indicators. Its overall mean Average Precision (mAP) increased by 5.5%. On an RTX 2080 Ti graphics card, the detection frame rate reaches 84 frames per second, enabling more precise and real-time bridge disease detection.
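
The CIoU loss mentioned above can be sketched as follows, using the standard Complete-IoU definition: IoU minus a normalized centre-distance penalty minus an aspect-ratio penalty. This is a minimal illustration and not the paper's training code; the function names, the (x1, y1, x2, y2) box format, and the `eps` stabilizer are assumptions.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU from the intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared distance between box centres.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (math.atan(wb / (hb + eps))
                                - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred, target):
    """Loss is 1 - CIoU, so a perfect match gives a loss close to 0."""
    return 1.0 - ciou(pred, target)


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))   # identical boxes
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))   # disjoint boxes
```

Unlike plain IoU, the loss for the disjoint pair exceeds 1 because the centre-distance term still separates non-overlapping boxes, which gives the optimizer a useful signal even when the boxes do not intersect.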






  • Article type: Journal Article
    Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures of small molecules, offering non-destructive, fast, reproducible, and unbiased analysis. However, mixture identification is challenging because of the chemical shift offsets and peak overlaps that often occur in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can identify plant flavors (mixtures) in a formulated flavor (a mixture consisting of several plant flavors) without the need to know the specific components of the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve these problems because of their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID achieves excellent prediction results on the augmented test set (ACC = 99.58%, TPR = 99.48%, FPR = 0.32%) and on two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78%, and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.






  • Article type: Journal Article
    Brain tumor segmentation from Magnetic Resonance Images (MRI) is considered a major challenge because of the complexity of brain tumor tissues, and separating these tissues from healthy tissue is even more tedious when radiologists perform the segmentation manually. In this paper, we present an experimental approach that emphasizes the impact and effectiveness of deep learning elements such as optimizers and loss functions in reaching an optimal deep learning solution for brain tumor segmentation. We evaluated our results on the most popular brain tumor datasets (MICCAI BraTS 2020 and RSNA-ASNR-MICCAI BraTS 2021). Furthermore, we introduce a new architecture, Bridged U-Net-ASPP-EVO, which uses Atrous Spatial Pyramid Pooling to better capture multi-scale information for segmenting tumors of different sizes, together with Evolving Normalization layers, squeeze and excitation residual blocks, and max-average pooling for downsampling. Two variants of this architecture were constructed (Bridged U-Net_ASPP_EVO v1 and Bridged U-Net_ASPP_EVO v2). These two models achieved the best results when compared with other state-of-the-art models: average segmentation Dice scores of 0.84, 0.85, and 0.91 from v1, and 0.83, 0.86, and 0.92 from v2, for the Enhanced Tumor (ET), Tumor Core (TC), and Whole Tumor (WT) sub-regions, respectively, on the BraTS 2021 validation dataset.
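
The Dice score used to report the segmentation results above measures the overlap between a predicted mask and a reference mask. A minimal sketch for flat binary masks follows; the function name `dice_score`, the convention of returning 1.0 when both masks are empty, and the example masks are assumptions for illustration, not BraTS evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat 0/1 masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical five-voxel masks: two voxels overlap, three are set in each.
pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(round(dice_score(pred, target), 4))
```

For a 3D volume the masks would be flattened first, and the score would be computed separately for each sub-region (ET, TC, WT).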






  • Article type: Journal Article
    Addressing the challenges of low detection precision and excessive parameter volume presented by the high resolution, significant scale variations, and complex backgrounds in UAV aerial imagery, this paper introduces MFP-YOLO, a lightweight detection algorithm based on YOLOv5s. First, a multipath inverse residual module is designed, and an attention mechanism is incorporated to manage the issues associated with significant scale variations and abundant interference from complex backgrounds. Then, parallel deconvolutional spatial pyramid pooling is employed to extract scale-specific information, enhancing multi-scale target detection. Furthermore, the Focal-EIoU loss function is used to increase the model's focus on high-quality samples, improving training stability and detection accuracy. Finally, a lightweight decoupled head replaces the original model's detection head, accelerating network convergence and enhancing detection precision. Experimental results demonstrate that MFP-YOLO improved the mAP50 on the VisDrone 2019 validation and test sets by 12.9% and 8.0%, respectively, compared to the original YOLOv5s. At the same time, the model's parameter volume and weight size were reduced by 79.2% and 73.7%, respectively, indicating that MFP-YOLO outperforms other mainstream algorithms in UAV aerial imagery detection tasks.






  • Article type: Journal Article
    Brain tumor diagnosis has been a lengthy process, and automating steps such as brain tumor segmentation speeds up the timeline. U-Nets are a commonly used solution for semantic segmentation, and they use a downsampling-upsampling approach to segment tumors. U-Nets rely on residual connections to pass information during upsampling; however, an upsampling block only receives information from one downsampling block. This restricts the context and scope of an upsampling block. In this paper, we propose SPP-U-Net, where the residual connections are replaced with a combination of Spatial Pyramid Pooling (SPP) and Attention blocks. Here, SPP provides information from various downsampling blocks, which increases the scope of reconstruction, while attention provides the necessary context by incorporating local characteristics with their corresponding global dependencies. Existing literature uses heavy approaches such as nested and dense skip connections and transformers. These approaches increase the number of trainable parameters, which in turn increases the training time and complexity of the model. The proposed approach, on the other hand, attains results comparable to existing literature without changing the number of trainable parameters over larger dimensions such as 160 × 192 × 192. Overall, the proposed model achieves an average Dice score of 0.883 and a Hausdorff distance of 7.84 on BraTS 2021 cross-validation.
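
The Hausdorff distance reported above measures how far apart two segmentation boundaries are in the worst case. A minimal brute-force sketch of the plain symmetric form follows. The abstract does not say which variant is reported (segmentation benchmarks often use a 95th-percentile version), so this is an assumption; the function name and the point sets are hypothetical.

```python
import math


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Hypothetical 2D boundary points; real masks would supply voxel coordinates.
a = [(0, 0), (1, 0)]
b = [(0, 0), (4, 0)]
print(hausdorff(a, b))
```

The brute-force search costs O(|A| · |B|) distance evaluations, which is fine for illustration but slow for full 3D surfaces.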






  • Article type: Journal Article
    Pork is the most widely consumed meat product in the world, and accurate detection of individual pigs is of great significance for intelligent pig breeding and health monitoring. Improved pig detection has important implications for pork production and quality, as well as economics. However, most current approaches rely on manual labor, resulting in inadequate performance. To improve the efficiency and effectiveness of individual pig detection, this paper describes the development of an attention-enhanced YOLOv3-SC model (YOLOv3-SPP-CBAM, where SPP denotes the Spatial Pyramid Pooling module and CBAM denotes the Convolutional Block Attention Module). Specifically, by leveraging the attention module, the network extracts much richer feature information, leading to improved performance. Furthermore, integrating the SPP structure enables multi-scale feature fusion, which makes the network more robust. On the constructed dataset of 4019 samples, the experimental results showed that the YOLOv3-SC network achieved 99.24% mAP in identifying individual pigs with a detection time of 16 ms. Compared with four other popular models, namely YOLOv1, YOLOv2, Faster-RCNN, and YOLOv3, the mAP of pig identification was improved by 2.31%, 1.44%, 1.28%, and 0.61%, respectively. The YOLOv3-SC model proposed in this paper can achieve accurate detection of individual pigs. Consequently, the proposed model can be employed for the rapid detection of individual pigs on farms and provides new ideas for individual pig detection.





