MirrorCalib: Utilizing Human Pose Information for Mirror-based Virtual Camera Calibration (2311.02791v3)
Abstract: In this paper, we present the novel task of estimating the extrinsic parameters of a virtual camera relative to a real camera in exercise videos with a mirror. This task poses a significant challenge in scenarios where the views from the real and mirrored cameras have no overlap or share no salient features. To address this issue, prior knowledge of the human body and 2D joint locations is used to estimate the camera extrinsic parameters when a person is in front of a mirror. We devise a modified eight-point algorithm to obtain an initial estimate from 2D joint locations. The 2D joint locations are then refined subject to human body constraints. Finally, a RANSAC algorithm is employed to remove outliers by comparing their epipolar distances to a predetermined threshold. MirrorCalib achieves a rotation error of 1.82° and a translation error of 69.51 mm on a collected real-world dataset, outperforming the state-of-the-art method.
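To make the pipeline above more concrete, the following is a minimal sketch of its two generic building blocks: an eight-point estimate of the fundamental matrix from matched 2D points (for example, a joint seen directly and its mirror reflection) and RANSAC outlier rejection based on epipolar distance. It uses the standard normalized eight-point algorithm, not the paper's modified variant, and it omits the refinement under human body constraints. The function names, the 2-pixel threshold, and the iteration count are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: zero mean, mean distance sqrt(2) from origin."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(x1, x2):
    """Estimate F such that x2^T F x1 = 0 from N >= 8 matches (N x 2 arrays)."""
    n1, T1 = normalize(x1)
    n2, T2 = normalize(x2)
    # Each match contributes one row of the linear system A f = 0.
    A = np.column_stack([
        n2[:, 0] * n1[:, 0], n2[:, 0] * n1[:, 1], n2[:, 0],
        n2[:, 1] * n1[:, 0], n2[:, 1] * n1[:, 1], n2[:, 1],
        n1[:, 0], n1[:, 1], np.ones(len(n1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization.
    return T2.T @ F @ T1

def epipolar_distance(F, x1, x2):
    """Mean of the two point-to-epipolar-line distances, in pixels."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    l2 = h1 @ F.T  # epipolar lines in image 2 (F x1)
    l1 = h2 @ F    # epipolar lines in image 1 (F^T x2)
    num = np.abs(np.sum(h2 * l2, axis=1))
    d2 = num / np.linalg.norm(l2[:, :2], axis=1)
    d1 = num / np.linalg.norm(l1[:, :2], axis=1)
    return 0.5 * (d1 + d2)

def ransac_fundamental(x1, x2, threshold=2.0, iters=500, seed=0):
    """Keep the 8-point model with the most inliers, then refit on them."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), 8, replace=False)
        F = eight_point(x1[idx], x2[idx])
        inliers = epipolar_distance(F, x1, x2) < threshold
        if inliers.sum() > best.sum():
            best = inliers
    return eight_point(x1[best], x2[best]), best
```

Given two `N x 2` arrays of matched pixel coordinates, `F, mask = ransac_fundamental(x1, x2)` returns the refitted matrix and a boolean inlier mask. Recovering the extrinsic parameters from `F` additionally requires the camera intrinsics and, in the mirror setting, handling of the reflection, which this sketch does not attempt.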