Keywords: 2DHeadPose; Head pose estimation; Label smoothing; Landmark-based; Landmark-free

MeSH: Humans; Algorithms; Face; Software

Source: DOI:10.1016/j.neunet.2022.12.021

Abstract:
Head pose estimation, which predicts the Euler angles of the head in an image, is one of the essential tasks in computer vision. In recent years, CNN-based methods for head pose estimation have achieved excellent performance. Their training relies on either RGB images annotated with facial landmarks or depth images from RGBD cameras. However, labeling facial landmarks in RGB images is difficult for large-angle head poses, and RGBD cameras are unsuitable for outdoor scenes. We propose a simple and effective annotation method for head pose in RGB images. The novel method uses a 3D virtual human head to simulate the head pose in the RGB image; the Euler angles can then be calculated from the change in coordinates of the 3D virtual head. Using this annotation method, we create the 2DHeadPose dataset, which contains a rich set of attributes, dimensions, and angles. Finally, we propose Gaussian label smoothing to suppress annotation noise and reflect inter-class relationships, and we establish a baseline approach using it. Experiments demonstrate that our annotation method, dataset, and Gaussian label smoothing are very effective. Our baseline approach surpasses most current state-of-the-art methods. The annotation tool, dataset, and source code are publicly available at https://github.com/youngnuaa/2DHeadPose.
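The abstract does not spell out how Gaussian label smoothing is formulated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the angle range is discretized into bins and that each ground-truth angle is turned into a soft target by placing a normalized Gaussian over the bin centers. The function names, the 3-degree bin width, the [-99, 99] range, and the `sigma` value are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_soft_label(angle_deg, bin_centers, sigma=3.0):
    """Build a soft classification target for one Euler angle.

    Each bin receives a weight that decays with its distance from the
    annotated angle, so neighboring bins share probability mass instead
    of the target being a one-hot vector. sigma is an assumed value.
    """
    weights = np.exp(-0.5 * ((bin_centers - angle_deg) / sigma) ** 2)
    return weights / weights.sum()  # normalize to a probability vector


def expected_angle(probs, bin_centers):
    """Decode a distribution over bins back to a continuous angle."""
    return float(np.dot(probs, bin_centers))


# Assumed binning: 3-degree bins with centers from -99 to 99 degrees.
bin_centers = np.arange(-99.0, 100.0, 3.0)

# Soft target for a hypothetical yaw annotation of 12 degrees.
label = gaussian_soft_label(12.0, bin_centers)
decoded = expected_angle(label, bin_centers)
```

Under these assumptions, a small annotation error only shifts the soft target slightly rather than flipping it to a different class, and adjacent angle bins are treated as related. That is one plausible reading of how such smoothing could suppress annotation noise and reflect inter-class relationships.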