Camera Calibration through Geometric Constraints from Rotation and Projection Matrices (2402.08437v2)
Abstract: The process of camera calibration involves estimating the intrinsic and extrinsic parameters, which are essential for accurately performing tasks such as 3D reconstruction, object tracking, and augmented reality. In this work, we propose a novel constraints-based loss for estimating the intrinsic (focal length $(f_x, f_y)$ and principal point $(p_x, p_y)$) and extrinsic (baseline $b$, disparity $d$, translation $(t_x, t_y, t_z)$, and rotation, specifically pitch $\theta_p$) camera parameters. Our constraints are based on geometric properties inherent in the camera model, including the anatomy of the projection matrix (vanishing points, image of the world origin, axis planes) and the orthonormality of the rotation matrix. We formulate these constraints as a novel Unsupervised Geometric Constraint Loss (UGCL) within a multitask learning framework. Our methodology is a hybrid approach that combines the learning power of a neural network with the mathematical properties inherent in the camera projection matrix to estimate the desired parameters. This distinctive approach not only enhances the interpretability of the model but also facilitates a more informed learning process. Additionally, we introduce a new CVGL Camera Calibration dataset featuring over 900 configurations of camera parameters and comprising 63,600 image pairs that closely mirror real-world conditions. By training and testing on both synthetic and real-world datasets, our proposed approach demonstrates improvements across all parameters compared to state-of-the-art (SOTA) benchmarks. The code and the updated dataset can be found here: https://github.com/CVLABLUMS/CVGL-Camera-Calibration
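To make the constraint terms concrete, the sketch below shows one way such a loss could be written. It is a minimal illustration in PyTorch, not the authors' released implementation; the helper names, the reference matrix `P_ref` (e.g., derived from known correspondences or a ground-truth calibration), and the equal weighting of the two terms are assumptions made for exposition.

```python
import torch

def build_K(fx, fy, px, py):
    """Assemble the intrinsic matrix K from 0-d parameter tensors."""
    zero = torch.zeros_like(fx)
    one = torch.ones_like(fx)
    return torch.stack([
        torch.stack([fx, zero, px]),
        torch.stack([zero, fy, py]),
        torch.stack([zero, zero, one]),
    ])

def geometric_constraint_loss(fx, fy, px, py, R, t, P_ref):
    """Illustrative geometric constraint terms (not the paper's exact loss):
    - orthonormality: a valid rotation satisfies R @ R^T = I
    - projection-matrix anatomy: in P = K [R | t], the first three columns
      are the homogeneous vanishing points of the world axes and the fourth
      column is the image of the world origin."""
    K = build_K(fx, fy, px, py)
    P_pred = K @ torch.cat([R, t.reshape(3, 1)], dim=1)   # 3x4

    # Penalise deviation of the predicted rotation from orthonormality.
    loss_rot = torch.norm(R @ R.T - torch.eye(3))

    # Dehomogenise each column and compare vanishing points / image of
    # the world origin against the reference projection matrix.
    cols_pred = P_pred[:2, :] / (P_pred[2:3, :] + 1e-8)
    cols_ref = P_ref[:2, :] / (P_ref[2:3, :] + 1e-8)
    loss_proj = torch.mean(torch.abs(cols_pred - cols_ref))

    return loss_rot + loss_proj

# Example usage with placeholder values (hypothetical, for illustration only).
fx = torch.tensor(721.5, requires_grad=True)
fy = torch.tensor(721.5, requires_grad=True)
px = torch.tensor(609.6, requires_grad=True)
py = torch.tensor(172.9, requires_grad=True)
R = torch.eye(3, requires_grad=True)
t = torch.zeros(3, requires_grad=True)
P_ref = build_K(fx.detach(), fy.detach(), px.detach(), py.detach()) @ torch.eye(3, 4)
loss = geometric_constraint_loss(fx, fy, px, py, R, t, P_ref)
loss.backward()
```

The property being exercised is standard projective geometry: the columns of $P = K[R\,|\,t]$ are the images of the points at infinity along the world axes (the vanishing points) and the image of the world origin, so matching dehomogenised columns enforces the anatomy of the projection matrix, while $RR^\top = I$ enforces a valid rotation.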