A comprehensive framework for occluded human pose estimation (2401.00155v2)
Abstract: Occlusion presents a significant challenge in human pose estimation. The difficulty can be attributed to three factors: 1) Data: occluded human pose samples are difficult to collect and annotate. 2) Feature: occlusion can cause feature confusion because the target person and interfering individuals are highly similar. 3) Inference: robust inference becomes difficult when complete body structural information is lost. Existing methods for occluded human pose estimation usually address only one of these factors. In this paper, we propose DAG (Data, Attention, Graph), a comprehensive framework that addresses the performance degradation caused by occlusion. Specifically, we introduce a data augmentation technique, mask joints with instance paste, to simulate occlusion scenarios. Additionally, we propose an Adaptive Discriminative Attention Module (ADAM) to enhance the features of target individuals. Furthermore, we present the Feature-Guided Multi-Hop GCN (FGMP-GCN) to fully exploit prior knowledge of body structure and improve pose estimation results. Extensive experiments on three benchmark datasets for occluded human pose estimation demonstrate that the proposed method outperforms existing methods. Code and data will be publicly available.
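The abstract does not specify how FGMP-GCN is built, so the following is only a minimal sketch of the general multi-hop idea it names: on a skeleton graph, a joint can be related not just to its direct neighbors but also to joints two or more edges away, which is one way body-structure priors could help when a joint is occluded. The 5-joint chain, the `EDGES` list, and the helper names `hop_distances` and `k_hop_adjacency` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque

# Hypothetical 5-joint chain used only for illustration (not the paper's skeleton):
# 0 head - 1 neck - 2 shoulder - 3 elbow - 4 wrist
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def hop_distances(num_joints, edges):
    """Return dist[i][j], the shortest-path length (in edges) between joints i and j."""
    neighbors = [[] for _ in range(num_joints)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    dist = [[None] * num_joints for _ in range(num_joints)]
    for src in range(num_joints):
        # Breadth-first search from each joint over the undirected skeleton.
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] is None:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def k_hop_adjacency(num_joints, edges, k):
    """Binary matrix whose (i, j) entry is 1 when joint j is exactly k hops from joint i."""
    dist = hop_distances(num_joints, edges)
    return [
        [1 if dist[i][j] == k else 0 for j in range(num_joints)]
        for i in range(num_joints)
    ]


if __name__ == "__main__":
    # The wrist (joint 4) has the shoulder (joint 2) as its only 2-hop neighbor.
    print(k_hop_adjacency(5, EDGES, 2)[4])
```

In a multi-hop graph convolution, one such matrix per hop count would typically gate which joints exchange features at that range. How the paper weights or guides these hops with image features is not stated in the abstract and is not modeled here.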