
On the representation and methodology for wide and short range head pose estimation (2401.05807v1)

Published 11 Jan 2024 in cs.CV

Abstract: Head pose estimation (HPE) is a problem of interest in computer vision to improve the performance of face processing tasks in semi-frontal or profile settings. Recent applications require the analysis of faces in the full 360° rotation range. Traditional approaches to solve the semi-frontal and profile cases are not directly amenable to the full rotation case. In this paper we analyze the methodology for short- and wide-range HPE and discuss which representations and metrics are adequate for each case. We show that the popular Euler angles representation is a good choice for short-range HPE, but not at extreme rotations. However, the Euler angles' gimbal lock problem prevents them from being used as a valid metric in any setting. We also revisit the current cross-data set evaluation methodology and note that the lack of alignment between the reference systems of the training and test data sets negatively biases the results of all articles in the literature. We introduce a procedure to quantify this misalignment and a new methodology for cross-data set HPE that establishes new, more accurate, SOTA for the 300W-LP|Biwi benchmark. We also propose a generalization of the geodesic angular distance metric that enables the construction of a loss that controls the contribution of each training sample to the optimization of the model. Finally, we introduce a wide range HPE benchmark based on the CMU Panoptic data set.
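
To illustrate the abstract's point about gimbal lock, the following is a minimal sketch, not the authors' code. It compares a per-angle Euler error with the standard geodesic angular distance between rotation matrices, theta = arccos((trace(R1^T R2) - 1) / 2). The Euler convention used here (R = Rz(a) Ry(b) Rx(c)) and all function names are assumptions chosen for illustration. The paper's generalized geodesic metric is not reproduced.

```python
import math

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    # Plain 3x3 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def euler_to_matrix(a, b, c):
    # Assumed convention for this sketch: R = Rz(a) @ Ry(b) @ Rx(c), in degrees.
    return matmul(rot_z(a), matmul(rot_y(b), rot_x(c)))

def geodesic_deg(R1, R2):
    # Angle of the relative rotation R1^T R2, in degrees.
    R = matmul(transpose(R1), R2)
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.degrees(math.acos(cos_theta))

def euler_mae_deg(e1, e2):
    # Mean absolute difference of the three angles (no wrap-around handling).
    return sum(abs(x - y) for x, y in zip(e1, e2)) / 3.0

# At b = 90 degrees the first and third angles become coupled (gimbal lock):
# only their difference a - c affects the resulting rotation.
e1 = (10.0, 90.0, 40.0)
e2 = (40.0, 90.0, 70.0)
euler_error = euler_mae_deg(e1, e2)
geodesic_error = geodesic_deg(euler_to_matrix(*e1), euler_to_matrix(*e2))
print(euler_error, geodesic_error)
```

Under this convention the two Euler triplets describe the same rotation, so the geodesic distance is zero up to rounding while the per-angle error is far from zero. This is the kind of disagreement that makes Euler-angle differences unreliable as a metric.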
