
PoseGraphNet++: Enriching 3D Human Pose with Orientation Estimation (2308.11440v2)

Published 22 Aug 2023 in cs.CV

Abstract: Existing skeleton-based 3D human pose estimation methods only predict joint positions. Although the yaw and pitch of bone rotations can be derived from joint positions, the roll around the bone axis remains unresolved. We present PoseGraphNet++ (PGN++), a novel 2D-to-3D lifting Graph Convolution Network that predicts the complete human pose in 3D including joint positions and bone orientations. We employ both node and edge convolutions to utilize the joint and bone features. Our model is evaluated on multiple datasets using both position and rotation metrics. PGN++ performs on par with the state-of-the-art (SoA) on the Human3.6M benchmark. In generalization experiments, it achieves the best results in position and matches the SoA in orientation, showcasing a more balanced performance than the current SoA. PGN++ exploits the mutual relationship of joints and bones, resulting in significantly improved position predictions, as shown by our ablation results.
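
The roll ambiguity mentioned in the abstract can be illustrated with a small numerical sketch. This is not code from the paper; the joint coordinates, the 70-degree angle, and the helper name `rotation_about_axis` are assumptions chosen for illustration. It shows that rotating a bone about its own axis leaves the child joint position unchanged while still changing the bone's orientation, which is why joint positions alone cannot determine roll.

```python
import numpy as np

def rotation_about_axis(axis, angle):
    """Rodrigues' formula: rotation matrix for `angle` radians about `axis`."""
    x, y, z = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])  # cross-product matrix of the unit axis
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Hypothetical parent and child joints defining one bone.
parent = np.array([0.0, 0.0, 0.0])
child = np.array([0.3, 0.4, 0.0])
bone = child - parent

# Apply a roll of 70 degrees about the bone's own axis.
R_roll = rotation_about_axis(bone, np.deg2rad(70.0))

# The child joint does not move: positions cannot reveal the roll.
rolled_child = parent + R_roll @ bone
position_unchanged = np.allclose(rolled_child, child)

# A direction perpendicular to the bone (e.g. a "side" axis of the limb)
# does move, so the full orientation has changed.
side = np.array([0.0, 0.0, 1.0])
rolled_side = R_roll @ side
orientation_changed = not np.allclose(rolled_side, side)

print(position_unchanged, orientation_changed)
```

In this sketch the bone direction fixes two rotational degrees of freedom (the yaw and pitch of the abstract), and the remaining rotation about the bone axis is invisible in the joint coordinates.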

