Keywords: YOLO-Pose; deep learning; drones; human posture; target detection

MeSH: Humans; Algorithms; Posture / physiology; Unmanned Aerial Devices; Image Processing, Computer-Assisted / methods

Source: DOI:10.3390/s24103036  PDF (PubMed)

Abstract:
In response to the numerous challenges faced by traditional human pose recognition methods in practical applications, such as dense targets, severe edge occlusion, limited application scenarios, complex backgrounds, and poor recognition accuracy when targets are occluded, this paper proposes a YOLO-Pose algorithm for human pose estimation. The specific improvements are divided into four parts. Firstly, in the Backbone section of the YOLO-Pose model, lightweight GhostNet modules are introduced to reduce the model's parameter count and computational requirements, making it suitable for deployment on unmanned aerial vehicles (UAVs). Secondly, the ACmix attention mechanism is integrated into the Neck section to improve detection speed during object judgment and localization. Furthermore, in the Head section, key points are optimized using coordinate attention mechanisms, significantly enhancing key-point localization accuracy. Lastly, the paper improves the loss function and confidence function to enhance the model's robustness. Experimental results demonstrate that the improved model achieves a 95.58% improvement in mAP50 and a 69.54% improvement in mAP50-95 compared to the original model, with a reduction of 14.6 M parameters. The model achieves a detection speed of 19.9 ms per image, optimized by 30% and 39.5% compared to the original model. Comparisons with other algorithms such as Faster R-CNN, SSD, YOLOv4, and YOLOv7 demonstrate varying degrees of performance improvement.
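As a rough illustration of why replacing standard convolutions with GhostNet modules shrinks the parameter count, the sketch below compares the weight counts of an ordinary convolution and a Ghost module. This is a back-of-the-envelope calculation under the original GhostNet design assumptions (ratio s = 2, 3x3 depthwise "cheap" operations), not code or figures from this paper.

```python
# Parameter counts: standard k x k convolution vs. a GhostNet "Ghost" module.
# Assumptions (standard GhostNet defaults, not taken from this paper):
# ratio s = 2 and 3x3 depthwise kernels for the cheap operations.

def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Ghost module: a primary convolution produces c_out // s 'intrinsic'
    channels; the remaining channels are generated by (s - 1) cheap
    d x d depthwise operations applied to those intrinsic features."""
    primary = c_in * (c_out // s) * k * k        # ordinary convolution part
    cheap = (c_out // s) * (s - 1) * d * d       # depthwise "ghost" features
    return primary + cheap

if __name__ == "__main__":
    full = conv_params(64, 128, 3)    # 73728 weights
    ghost = ghost_params(64, 128, 3)  # 37440 weights, roughly half
    print(full, ghost)
```

With s = 2 the module needs roughly half the weights of the convolution it replaces, which is the mechanism behind the lighter UAV-deployable backbone described in the abstract.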