
Latent Embedding Clustering for Occlusion Robust Head Pose Estimation (2403.20251v1)

Published 29 Mar 2024 in cs.CV

Abstract: Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications, including robotics, surveillance, and driver attention monitoring. One of the most difficult challenges in this field is managing the head occlusions that frequently occur in real-world scenarios. In this paper, we propose a novel and efficient framework that is robust to head occlusion in real-world scenarios. In particular, we propose an unsupervised latent embedding clustering with regression and classification components for each pose angle. The model optimizes latent feature representations for occluded and non-occluded images through a clustering term while improving fine-grained angle predictions. Experimental evaluation on in-the-wild head pose benchmark datasets reveals competitive performance in comparison to state-of-the-art methodologies, with the advantage of a significant data reduction. We observe a substantial improvement in occluded head pose estimation. An ablation study is also conducted to ascertain the impact of the clustering term within our proposed framework.
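
The abstract does not spell out the objective, so the following is only an illustrative sketch of how a clustering term could be combined with per-angle classification and regression terms. It assumes a deep-embedded-clustering style term (Student's t soft assignments with a KL divergence to sharpened targets) and a binned-angle head whose expected value gives the regression output. All function names, the bin layout, and the weights `alpha` and `gamma` are hypothetical and are not taken from the paper.

```python
import math

def soft_assign(z, centroids):
    """Student's t soft assignment of one embedding z to each centroid (assumed DEC-style)."""
    weights = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c))) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        s = sum(w)
        p.append([x / s for x in w])
    return p

def clustering_loss(q, p):
    """KL(P || Q), averaged over samples."""
    kl = sum(pi * math.log(pi / qi)
             for row_p, row_q in zip(p, q)
             for pi, qi in zip(row_p, row_q) if pi > 0)
    return kl / len(q)

def softmax(logits):
    m = max(logits)
    e = [math.exp(x - m) for x in logits]
    s = sum(e)
    return [x / s for x in e]

def angle_loss(logits, true_deg, bin_centers, alpha=1.0):
    """Cross-entropy on the nearest bin plus squared error on the expected angle."""
    probs = softmax(logits)
    true_bin = min(range(len(bin_centers)), key=lambda k: abs(bin_centers[k] - true_deg))
    ce = -math.log(probs[true_bin])
    expected = sum(p * c for p, c in zip(probs, bin_centers))
    return ce + alpha * (expected - true_deg) ** 2

def total_loss(embeddings, centroids, angle_logits, angle_targets, bin_centers, gamma=0.1):
    """Mean pose loss over samples (summed over angles) plus a weighted clustering term.

    angle_logits[i] holds one logit vector per pose angle (e.g. yaw, pitch, roll)
    for sample i; angle_targets[i] holds the matching ground-truth angles in degrees.
    """
    q = [soft_assign(z, centroids) for z in embeddings]
    p = target_distribution(q)
    pose = sum(angle_loss(l, t, bin_centers)
               for sample_logits, sample_targets in zip(angle_logits, angle_targets)
               for l, t in zip(sample_logits, sample_targets)) / len(embeddings)
    return pose + gamma * clustering_loss(q, p)
```

In this sketch the clustering term acts only on the latent embeddings, so occluded and non-occluded images that land near the same centroid are pulled toward a shared representation, while the per-angle terms keep the predictions fine-grained. How the paper actually weights or schedules these terms is not stated in the abstract.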

