Multiview-Consistent Semi-Supervised Learning for 3D Human Pose Estimation (1908.05293v3)

Published 14 Aug 2019 in cs.CV

Abstract: The best-performing methods for 3D human pose estimation from monocular images require large amounts of in-the-wild 2D and controlled 3D pose-annotated data, which are costly and require sophisticated systems to acquire. To reduce this annotation dependency, we propose a Multiview-Consistent Semi-Supervised Learning (MCSS) framework that utilizes similarity in pose information from unannotated, uncalibrated but synchronized multi-view videos of human motions as an additional weak supervision signal to guide 3D human pose regression. Our framework applies hard-negative mining based on temporal relations in multi-view videos to arrive at a multi-view consistent pose embedding. When jointly trained with limited 3D pose annotations, our approach improves the baseline by 25% and the state of the art by 8.7%, whilst using substantially smaller networks. Lastly, but importantly, we demonstrate the advantages of the learned embedding and establish view-invariant pose retrieval benchmarks on two popular, publicly available multi-view human pose datasets, Human 3.6M and MPI-INF-3DHP, to facilitate future research.
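
The abstract does not spell out the loss, so the following is only an illustrative sketch of how a multi-view consistency signal with temporally guided hard-negative mining could look, not the authors' implementation. It assumes a triplet-style hinge loss over per-frame embeddings from two synchronized views: the positive is the same timestamp seen from the other view, and the hard negative is the closest embedding among frames at least `min_gap` frames away. The function name `multiview_triplet_loss` and the parameters `min_gap` and `margin` are hypothetical.

```python
import numpy as np

def multiview_triplet_loss(emb_a, emb_b, min_gap=2, margin=0.5):
    """Hypothetical sketch: emb_a, emb_b are (T, D) embeddings of
    synchronized frames from two views of the same motion."""
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    num_frames = emb_a.shape[0]

    # Squared distance between every view-A frame and every view-B frame.
    diff = emb_a[:, None, :] - emb_b[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)            # shape (T, T)

    # Positive: same timestamp, different view.
    pos = np.diag(dist)

    # Negative candidates (assumption): frames at least min_gap steps away.
    idx = np.arange(num_frames)
    far = np.abs(idx[:, None] - idx[None, :]) >= min_gap
    masked = np.where(far, dist, np.inf)

    # Hard negative: the closest temporally distant frame in embedding space.
    hard_neg = masked.min(axis=1)
    valid = np.isfinite(hard_neg)                # anchors with a candidate

    losses = np.maximum(0.0, pos - hard_neg + margin)
    return float(losses[valid].mean()) if valid.any() else 0.0
```

In a semi-supervised setup of the kind the abstract describes, a term like this would be computed on the unannotated multi-view clips and combined with a supervised 3D pose regression loss on the limited annotated data; how the two are weighted is not stated in the abstract.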

Authors (4)

Rahul Mitra (8 papers)
Nitesh B. Gundavarapu (6 papers)
Abhishek Sharma (112 papers)
Arjun Jain (18 papers)

Citations (56)

