
A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation (2403.11310v2)

Published 17 Mar 2024 in cs.CV

Abstract: 3D human pose data collected in controlled laboratory settings present challenges for pose estimators that generalize across diverse scenarios. To address this, domain generalization is employed. Current methodologies in domain generalization for 3D human pose estimation typically utilize adversarial training to generate synthetic poses for training. Nonetheless, these approaches exhibit several limitations. First, the lack of prior information about the target domain complicates the application of suitable augmentation through a single pose augmentor, affecting generalization on target domains. Moreover, adversarial training's discriminator tends to enforce similarity between source and synthesized poses, impeding the exploration of out-of-source distributions. Furthermore, the pose estimator's optimization is not exposed to domain shifts, limiting its overall generalization ability. To address these limitations, we propose a novel framework featuring two pose augmentors: the weak and the strong augmentors. Our framework employs differential strategies for generation and discrimination processes, facilitating the preservation of knowledge related to source poses and the exploration of out-of-source distributions without prior information about target poses. Besides, we leverage meta-optimization to simulate domain shifts in the optimization process of the pose estimator, thereby improving its generalization ability. Our proposed approach significantly outperforms existing methods, as demonstrated through comprehensive experiments on various benchmark datasets. Our code will be released at https://github.com/davidpengucf/DAF-DG.
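
The abstract's idea of using meta-optimization to simulate domain shifts can be illustrated with a toy sketch. The abstract does not give the update rule, so the code below is an assumption: a generic first-order, MLDG-style step in which a "weak" batch plays the meta-train role and a "strong" batch plays the meta-test role. The scalar model `y = w * x`, the function names, the batches, and the hyperparameters are all hypothetical and stand in for the actual pose estimator and augmentors.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]  # (input, target) pairs; a stand-in for pose data


def mse_grad(w: float, batch: Batch) -> float:
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def meta_step(w: float, meta_train: Batch, meta_test: Batch,
              inner_lr: float = 0.05, outer_lr: float = 0.05,
              beta: float = 1.0) -> float:
    """One first-order meta-update (assumed form, not the paper's exact rule)."""
    g_train = mse_grad(w, meta_train)        # gradient on the near-source batch
    w_virtual = w - inner_lr * g_train       # virtual step taken on meta-train only
    g_test = mse_grad(w_virtual, meta_test)  # shifted batch evaluated after that step
    # First-order approximation: the second-order term through w_virtual is dropped.
    return w - outer_lr * (g_train + beta * g_test)


def fit(w: float, meta_train: Batch, meta_test: Batch, steps: int = 100) -> float:
    """Repeat the meta-update for a fixed number of steps."""
    for _ in range(steps):
        w = meta_step(w, meta_train, meta_test)
    return w


# Hypothetical batches: "weak" stays close to the source range, "strong" moves outside it.
weak_batch: Batch = [(1.0, 3.0), (2.0, 6.0)]
strong_batch: Batch = [(-2.0, -6.0), (3.0, 9.0)]

if __name__ == "__main__":
    print(fit(0.0, weak_batch, strong_batch))
```

The point of the sketch is only that the shifted batch is scored at the parameters reached after a step on the near-source batch, so the update is rewarded when progress on one distribution carries over to the other.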

Authors (3)
  1. Qucheng Peng (7 papers)
  2. Ce Zheng (45 papers)
  3. Chen Chen (753 papers)
Citations (13)
