
3D Human Pose Estimation Based on 2D-3D Consistency with Synchronized Adversarial Training (2106.04274v4)

Published 8 Jun 2021 in cs.CV

Abstract: 3D human pose estimation from a single image remains a challenging problem despite the large body of work in this field. Most methods apply neural networks directly and ignore constraints such as reprojection, joint angle, and bone length constraints. A few methods do consider these constraints but train their networks separately, so they cannot effectively resolve the depth ambiguity problem. In this paper, we propose a GAN-based model for 3D human pose estimation, in which a reprojection network learns the mapping of the distribution from 3D poses to 2D poses, and a discriminator judges 2D-3D consistency. We adopt a novel strategy to train the generator, the reprojection network, and the discriminator synchronously. Furthermore, inspired by the typical kinematic chain space (KCS) matrix, we introduce a weighted KCS matrix and use it as one of the discriminator's inputs to impose joint angle and bone length constraints. Experimental results on Human3.6M show that our method significantly outperforms state-of-the-art methods in most cases.
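For readers unfamiliar with the KCS matrix mentioned in the abstract, the following is a minimal sketch of the standard, unweighted form as it is commonly defined in earlier work: a bone matrix B = XC is built from the 3D joint positions X, and the KCS matrix is B^T B, whose diagonal holds squared bone lengths and whose off-diagonal entries hold inner products between bones (and therefore encode the angles between them). The four-joint toy skeleton, the bone list, and the function names are illustrative assumptions, not taken from the paper. The weighting scheme that the paper adds is not described in the abstract, so it is not reproduced here.

```python
import numpy as np

def bone_mapping_matrix(num_joints, bones):
    """Build C (num_joints x num_bones): column k has -1 at the parent
    joint and +1 at the child joint of bone k."""
    C = np.zeros((num_joints, len(bones)))
    for k, (parent, child) in enumerate(bones):
        C[parent, k] = -1.0
        C[child, k] = 1.0
    return C

def kcs_matrix(pose, C):
    """Unweighted KCS matrix for a pose of shape (3, num_joints).

    B = pose @ C stacks one bone vector per column; B.T @ B has squared
    bone lengths on the diagonal and bone inner products elsewhere."""
    B = pose @ C
    return B.T @ B

# Hypothetical toy skeleton: a chain of four joints (not the Human3.6M layout).
TOY_BONES = [(0, 1), (1, 2), (2, 3)]
pose = np.array([
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 2.0],
]).T  # shape (3, 4): one column per joint

C = bone_mapping_matrix(4, TOY_BONES)
psi = kcs_matrix(pose, C)
bone_lengths = np.sqrt(np.diag(psi))
```

In this toy pose the three bones are mutually orthogonal, so the off-diagonal entries of `psi` are zero and the diagonal gives the squared bone lengths. A discriminator that receives such a matrix sees bone lengths and inter-bone angles explicitly, rather than having to infer them from raw joint coordinates.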
