• Article type: Journal Article
    The prolonged, excessive use of electronic devices has led to problems such as neck pain and pressure injury in sedentary people. If not detected and corrected early, these issues can pose serious risks to physical health. Detectors for generic objects cannot adequately capture such subtle neck behaviors, resulting in missed detections. In this paper, we explore a deep learning-based solution for detecting abnormal neck behavior and propose a model called NABNet that combines object detection based on YOLOv5s with pose estimation based on Lightweight OpenPose. NABNet extracts detailed neck behavior characteristics from global to local and detects abnormal behavior by analyzing the angle data. We deployed NABNet on the cloud and on edge devices to achieve remote monitoring and abnormal behavior alarms. Finally, we applied the resulting NABNet-based IoT system to abnormal behavior detection in order to evaluate its effectiveness. The experimental results show that our system can effectively detect abnormal neck behavior and raise alarms on the cloud platform, with the highest accuracy reaching 94.13%.
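
    The angle analysis mentioned above can be illustrated with a minimal sketch. It is not NABNet's implementation: the choice of ear and shoulder keypoints, the vertical reference, and the 20-degree threshold are assumptions made only for illustration.

```python
import math

def neck_angle_deg(ear, shoulder):
    """Angle (degrees) between the shoulder->ear vector and the upward vertical.

    Points are (x, y) pixel coordinates with y growing downward, as in images.
    The keypoint choice is an assumption, not taken from the paper.
    """
    dx = ear[0] - shoulder[0]
    dy = shoulder[1] - ear[1]  # flip the sign so that "up" is positive
    return math.degrees(math.atan2(abs(dx), dy))

def is_abnormal(ear, shoulder, threshold_deg=20.0):
    """Flag a posture whose neck angle exceeds a hypothetical threshold."""
    return neck_angle_deg(ear, shoulder) > threshold_deg

# An upright neck (ear directly above shoulder) versus a forward-leaning one.
upright = neck_angle_deg((100, 50), (100, 150))
leaning = neck_angle_deg((200, 50), (100, 150))
```

    In a real pipeline the keypoints would come from the pose estimator, and the threshold would be tuned on labeled data.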






  • Article type: Journal Article
    Simultaneous Localization and Mapping (SLAM) is one of the key technologies for the autonomous navigation of mobile robots, utilizing environmental features to determine a robot's position and create a map of its surroundings. Currently, visual SLAM algorithms typically yield precise and dependable outcomes in static environments, and many algorithms opt to filter out the feature points in dynamic regions. However, when the number of dynamic objects within the camera's view increases, this approach might result in decreased accuracy or tracking failures. Therefore, this study proposes a solution called YPL-SLAM based on ORB-SLAM2. The solution adds a target recognition and region segmentation module to determine the dynamic region, potential dynamic region, and static region; determines the state of the potential dynamic region using the RANSAC method with epipolar geometric constraints; and removes the dynamic feature points. It then extracts the line features of the non-dynamic region and finally performs the point-line fusion optimization process using a weighted fusion strategy that considers the image dynamic score and the number of successful feature point-line matches, thus ensuring the system's robustness and accuracy. A large number of experiments have been conducted using the publicly available TUM dataset to compare YPL-SLAM with globally leading SLAM algorithms. The results demonstrate that the new algorithm surpasses ORB-SLAM2 in terms of accuracy (with a maximum improvement of 96.1%) while also exhibiting a significantly enhanced operating speed compared to Dyna-SLAM.
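
    A geometric-constraint check of the kind described above can be sketched as follows. This is a generic epipolar-distance test, not the YPL-SLAM code: the fundamental matrix F is assumed to be already estimated (for example by RANSAC), and the 1-pixel threshold is a hypothetical value.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance (pixels) from p2 to the epipolar line F @ [x1, y1, 1].

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are
    matched (x, y) points in the previous and current frame.
    """
    x1, y1 = p1
    x2, y2 = p2
    a, b, c = (row[0] * x1 + row[1] * y1 + row[2] for row in F)
    return abs(a * x2 + b * y2 + c) / math.hypot(a, b)

def is_dynamic(F, p1, p2, threshold_px=1.0):
    """A static point should lie close to its epipolar line."""
    return epipolar_distance(F, p1, p2) > threshold_px

# Fundamental matrix of a pure sideways camera translation:
# the epipolar line of (x1, y1) is the horizontal line y = y1.
F_sideways = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
```

    With `F_sideways`, a match that keeps its row is treated as static, while a match that shifts by three rows is flagged as dynamic.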






  • Article type: Journal Article
    Vehicle detection is a research direction in the field of target detection and is widely used in intelligent transportation, autonomous driving, urban planning, and other fields. To balance the speed advantage of lightweight networks and the precision advantage of multiscale networks, a vehicle detection algorithm based on a lightweight backbone network and a multiscale neck network is proposed. The MobileNetV3 lightweight network, based on depthwise separable convolution, is used as the backbone network to improve the speed of vehicle detection. The ICBAM attention mechanism module is used to strengthen the processing of the vehicle feature information extracted by the backbone network and thus enrich the input information of the neck network. The BiFPN and ICBAM attention mechanism modules are integrated into the neck network to improve the detection accuracy for vehicles of different sizes and categories. A vehicle detection experiment on the UA-DETRAC dataset verifies that the proposed algorithm can effectively balance vehicle detection accuracy and speed. The detection accuracy is 71.19%, the parameter size is 3.8 MB, and the detection speed is 120.02 fps, which meets the practical requirements on parameter quantity, detection speed, and accuracy for a vehicle detection algorithm embedded in a mobile device.
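
    The saving that depthwise separable convolution brings can be seen by counting weights. The sketch below uses the standard textbook formulas with biases ignored; the layer sizes are arbitrary illustrative values, not figures from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)        # 73728
sep = depthwise_separable_params(3, 64, 128)  # 8768
ratio = sep / std                             # roughly 0.12
```

    The ratio equals 1/c_out + 1/k², so for 3 × 3 kernels the separable form needs roughly one ninth of the weights when the channel count is large.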






  • Article type: Journal Article
    Resolving traffic congestion and personal safety issues is of paramount importance to human life. The ability of an autonomous driving system to navigate complex road conditions is crucial. Deep learning has greatly facilitated machine vision perception in autonomous driving. To address the weak small-target detection of the standard YOLOv5s, this paper proposes an optimized target detection algorithm. The C3 module in the algorithm's backbone is upgraded to the CBAMC3 module, and a GELU activation function and an EfficiCIoU loss function are introduced. These changes accelerate convergence of the position loss lbox, confidence loss lobj, and classification loss lcls, enhance image learning capabilities, and address the inaccurate detection of small targets. Testing with a vehicle-mounted camera on a predefined route shows that the algorithm effectively identifies road vehicles and analyzes depth position information. The avoidance model, combined with the Pure Pursuit and MPC control algorithms, exhibits more stable variations in vehicle speed, front-wheel steering angle, lateral acceleration, etc., compared to the non-optimized version. The robustness of the driving system's visual avoidance functionality is enhanced, further ameliorating congestion issues and ensuring personal safety.
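
    For reference, the GELU activation named above has a simple closed form, x · Φ(x), where Φ is the standard normal CDF. The sketch below is the exact scalar definition and is not taken from the paper's code; `relu` is included only for comparison.

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def relu(x: float) -> float:
    """ReLU, shown only to contrast its hard cutoff with GELU."""
    return max(0.0, x)

# GELU is smooth and lets small negative inputs through, unlike ReLU.
samples = [(x, gelu(x), relu(x)) for x in (-1.0, 0.0, 1.0)]
```

    Unlike ReLU, GELU returns a small negative value for moderately negative inputs, which keeps gradients non-zero there.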






  • Article type: Journal Article
    To address the slow detection speed, large number of parameters, and high computational cost of deep learning-based gangue target detection methods, we propose an improved gangue target detection algorithm based on YOLOv5s. First, the lightweight network EfficientViT is used as the backbone network to increase the target detection speed. Second, C3_Faster replaces the C3 part in the Head module, which reduces the model complexity. Third, the 20 × 20 feature map branch in the Neck region is deleted, which further reduces the model complexity. Fourth, the CIoU loss function is replaced by the MPDIoU loss function. Finally, the SE attention mechanism is introduced so that the model pays more attention to critical features, improving detection performance. Experimental results show that the improved coal gangue detection algorithm reduces the model size by 77.8%, the number of parameters by 78.3%, and the computational cost by 77.8%, while the number of frames is reduced by 30.6%. The algorithm can serve as a reference for intelligent coal gangue classification.
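
    The MPDIoU loss mentioned above can be sketched as follows. This follows the commonly published definition (IoU minus the squared distances between matching corners, normalized by the squared image diagonal); it is an illustrative scalar version, not the paper's training code, and the box and image sizes are made-up values.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def mpdiou_loss(pred, gt, img_w, img_h):
    """1 - MPDIoU, where MPDIoU = IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2).

    d1 is the distance between the top-left corners, d2 between the
    bottom-right corners; w and h are the input image dimensions.
    """
    norm = img_w ** 2 + img_h ** 2
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    return 1.0 - (iou(pred, gt) - d1_sq / norm - d2_sq / norm)

# Hypothetical boxes in a 10 x 10 image.
loss_same = mpdiou_loss((0, 0, 2, 2), (0, 0, 2, 2), 10, 10)
loss_shifted = mpdiou_loss((0, 0, 2, 2), (1, 1, 3, 3), 10, 10)
```

    Because the corner distances stay informative even when two boxes do not overlap, the loss still provides a gradient signal in that case, which plain IoU does not.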






  • Article type: Journal Article
    In complex industrial environments, accurate recognition and localization of industrial targets are crucial. This study aims to improve the precision and accuracy of object detection in industrial scenarios by effectively fusing feature information at different scales and levels, and introducing edge detection head algorithms and attention mechanisms. We propose an improved YOLOv5-based algorithm for industrial object detection. Our improved algorithm incorporates the Crossing Bidirectional Feature Pyramid (CBiFPN), effectively addressing the information loss issue in multi-scale and multi-level feature fusion. Therefore, our method can enhance detection performance for objects of varying sizes. Concurrently, we have integrated the attention mechanism (C3_CA) into YOLOv5s to augment feature expression capabilities. Furthermore, we introduce the Edge Detection Head (EDH) method, which is adept at tackling detection challenges in scenes with occluded objects and cluttered backgrounds by merging edge information and amplifying it within the features. Experiments conducted on the modified ITODD dataset demonstrate that the original YOLOv5s algorithm achieves 82.11% and 60.98% on mAP@0.5 and mAP@0.5:0.95, respectively, with its precision and recall being 86.8% and 74.75%, respectively. The performance of the modified YOLOv5s algorithm on mAP@0.5 and mAP@0.5:0.95 has been improved by 1.23% and 1.44%, respectively, and the precision and recall have been enhanced by 3.68% and 1.06%, respectively. The results show that our method significantly boosts the accuracy and robustness of industrial target recognition and localization.






  • Article type: Journal Article
    Steel strip is an important raw material for the engineering, automotive, shipbuilding, and aerospace industries. However, during the production process, the surface of the steel strip is prone to cracks, pitting, and other defects that affect its appearance and performance. It is important to use machine vision technology to detect defects on the surface of a steel strip in order to improve its quality. To address the difficulties in classifying the fine-grained features of strip steel surface images and to improve the defect detection rate, we propose an improved YOLOv5s model called YOLOv5s-FPD (Fine Particle Detection). The SPPF-A (Spatial Pyramid Pooling Fast-Advance) module was constructed to adjust the spatial pyramid structure, and the ASFF (Adaptively Spatial Feature Fusion) and CARAFE (Content-Aware ReAssembly of FEatures) modules were introduced to improve the feature extraction and fusion capabilities for strip images. The CSBL (Convolutional Separable Bottleneck) module was also constructed, and the DCNv2 (Deformable ConvNets v2) module was introduced to improve the model's lightweight properties. The CBAM (Convolutional Block Attention Module) attention module is used to extract key information, further improving the model's feature extraction capability. Experimental results on the NEU_DET (NEU surface defect database) dataset show that YOLOv5s-FPD improves the mAP50 accuracy by 2.6% before data enhancement and 1.8% after SSIE (steel strip image enhancement) data enhancement, compared to the YOLOv5s prototype. It also improves the detection accuracy of all six defects in the dataset. Experimental results on the VOC2007 public dataset demonstrate that YOLOv5s-FPD improves the mAP50 accuracy by 4.6% before data enhancement, compared to the YOLOv5s prototype. Overall, these results confirm the validity and usefulness of the proposed model.






  • Article type: Journal Article
    Recognizing wheat ears plays a crucial role in predicting wheat yield. Employing deep learning methods for wheat ear identification is the mainstream approach in current research and applications. However, such methods still face challenges, such as a large number of parameters, large model weights, and slow processing speeds, making it difficult to apply them to real-time identification tasks on limited hardware resources in the wheat field. Therefore, exploring lightweight wheat ear detection methods for real-time recognition is of significant importance.
    This study proposes a lightweight method for detecting and counting wheat ears based on YOLOv5s. It utilizes the ShuffleNetV2 lightweight convolutional neural network to optimize the YOLOv5s model by reducing the number of parameters and simplifying the calculation processes. In addition, the lightweight upsampling operator CARAFE (content-aware reassembly of features) is introduced in the feature pyramid structure to offset the impact of the lightweight process on the model's detection performance. This approach aims to improve the spatial resolution of the feature maps, enhance the effectiveness of the receptive field, and reduce information loss. Finally, by introducing the dynamic target detection head, the shape of the detection head and the feature extraction strategy can be dynamically adjusted, and the detection accuracy can be improved when encountering wheat ears with large-scale changes, diverse shapes, or significant orientation variations.
    This study uses the global wheat head detection dataset and incorporates a local experimental dataset to improve the robustness and generalization of the proposed model. The weight, FLOPs, and mAP of this model are 2.9 MB, 2.5 × 10^9, and 94.8%, respectively. The linear-fit coefficients of determination R^2 between the model test results and the actual values for the global wheat head detection dataset and the local experimental site are 0.94 and 0.97, respectively. The improved lightweight model can better meet the requirements of precise wheat ear counting and play an important role in embedded systems, mobile devices, or other hardware systems with limited computing resources.
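
    The coefficient of determination reported above can be computed from paired counts as sketched below. For a simple least-squares line, R^2 equals the squared Pearson correlation, which is what this sketch evaluates; the counts are made-up illustrative values, not data from the study.

```python
def r_squared(actual, predicted):
    """R^2 of the least-squares line fitted between two paired sequences.

    For simple linear regression this equals the squared Pearson correlation.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    s_aa = sum((a - mean_a) ** 2 for a in actual)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return (s_ap * s_ap) / (s_aa * s_pp)

# Hypothetical manual counts versus model counts per image.
perfect = r_squared([1, 2, 3], [3, 5, 7])   # exactly linear
noisy = r_squared([1, 2, 3], [1, 2, 4])     # slightly off the line
```

    A value close to 1, such as the 0.94 and 0.97 reported in the abstract, means the model's counts track the manual counts almost linearly.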






  • Article type: Journal Article
    With continuously increasing labor costs, an urgent need for automated apple-picking equipment has emerged in the agricultural sector. Prior to apple harvesting, it is imperative that the equipment not only accurately locates the apples, but also discerns the graspability of the fruit. While numerous studies on apple detection have been conducted, the challenges related to determining apple graspability remain unresolved.
    This study introduces a method for detecting multi-occluded apples based on an enhanced YOLOv5s model, with the aim of identifying the type of apple occlusion in complex orchard environments and determining apple graspability. Using bootstrap your own latent (BYOL) and knowledge transfer (KT) strategies, we effectively enhance the classification accuracy for multi-occluded apples while reducing data production costs. A selective kernel (SK) module is also incorporated, enabling the network model to more precisely identify various apple occlusion types. To evaluate the performance of our network model, we define three key metrics: APGA, APTUGA, and APUGA, representing the average detection accuracy for graspable, temporarily ungraspable, and ungraspable apples, respectively.
    Experimental results indicate that the improved YOLOv5s model performs exceptionally well, achieving detection accuracies of 94.78%, 93.86%, and 94.98% for APGA, APTUGA, and APUGA, respectively.
    Compared to current lightweight network models such as YOLOX-s and YOLOv7s, our proposed method demonstrates significant advantages across multiple evaluation metrics. In future research, we intend to integrate fruit posture and occlusion detection to further enhance the visual perception capabilities of apple-picking equipment.






  • Article type: Journal Article
    The hoist cage is used to lift miners in a coal mine's auxiliary shaft. Monitoring miners' unsafe behaviors and their status in the hoist cage is crucial to production safety in coal mines. In this study, a visual detection model is proposed to estimate the number and categories of miners, and to identify whether the miners are wearing helmets and whether they have fallen in the hoist cage. A dataset with eight categories of miners' statuses in hoist cages was developed for training and validating the model. Using the dataset, several classical models were trained for comparison, from which the YOLOv5s model was selected as the base model. Due to small-sized targets, poor lighting conditions, coal dust, and occlusion, the detection accuracy of the YOLOv5s model was only 89.2%. To obtain better detection accuracy, the k-means++ clustering algorithm, a BiFPN-based feature fusion network, the convolutional block attention module (CBAM), and a CIoU loss function were introduced to improve the YOLOv5s model, and an attentional multi-scale cascaded feature fusion-based YOLOv5s model (AMCFF-YOLOv5s) was subsequently developed. The training results on the self-built dataset indicate that its detection accuracy increased to 97.6%. Moreover, the AMCFF-YOLOv5s model was proven to be robust to noise and light.
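
    The k-means++ step named above is commonly used to pick initial anchor boxes. The sketch below shows only the seeding stage and assumes a 1 - IoU distance over (width, height) pairs, a frequent choice for anchor clustering that the abstract does not confirm; the box sizes are made-up values.

```python
import random

def wh_iou(a, b):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeanspp_seeds(boxes, k, seed=0):
    """Pick k initial centers: each new one is drawn with probability
    proportional to its squared distance to the nearest chosen center."""
    rng = random.Random(seed)
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - wh_iou(b, c) for c in centers) ** 2 for b in boxes]
        if sum(d2) == 0:
            # All boxes coincide with existing centers; fall back to uniform.
            centers.append(rng.choice(boxes))
        else:
            centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers

# Hypothetical labeled box sizes: two small boxes and one large box.
seeds = kmeanspp_seeds([(10, 10), (10, 10), (100, 100)], k=2)
```

    With these sizes the two seeds always cover both the small and the large box, because a box identical to an existing center has zero probability of being drawn again. A full anchor search would follow the seeding with ordinary k-means updates.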





