Global Localization: Utilizing Relative Spatio-Temporal Geometric Constraints from Adjacent and Distant Cameras (2312.00500v1)
Abstract: Re-localizing a camera from a single image in a previously mapped area is vital for many computer vision applications in robotics and augmented/virtual reality. In this work, we address the problem of estimating the 6-DoF camera pose relative to a global frame from a single image. We propose to leverage a novel network of relative spatial and temporal geometric constraints to guide the training of a deep network for localization. We simultaneously employ spatial and temporal relative pose constraints obtained not only from adjacent camera frames but also from camera frames that are distant in the spatio-temporal space of the scene. We show that, through these constraints, our method is capable of learning to localize when only little or very sparse ground-truth 3D coordinates are available; in our experiments, this amounts to less than 1% of the available ground-truth data. We evaluate our method on three common visual localization datasets and show that it outperforms other direct pose estimation methods.
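To make the idea of a relative pose constraint between two frames concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes poses are given as 4x4 camera-to-world matrices; the helper names (`make_pose`, `relative_pose`, `relative_pose_error`, `frame_pairs`) and the gap values are hypothetical and chosen only for illustration. The abstract does not specify how frame pairs are selected or how the constraints enter the training loss, so both are stand-ins here.

```python
import numpy as np


def make_pose(yaw_deg, translation):
    """Build a 4x4 camera-to-world pose from a yaw angle and a translation (illustrative helper)."""
    yaw = np.radians(yaw_deg)
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T


def relative_pose(T_a, T_b):
    """Pose of camera b expressed in the frame of camera a."""
    return np.linalg.inv(T_a) @ T_b


def rotation_angle_deg(R):
    """Angle of a 3x3 rotation matrix, in degrees."""
    cos_angle = (np.trace(R) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def relative_pose_error(T_a_pred, T_b_pred, T_a_ref, T_b_ref):
    """Translation and rotation discrepancy between predicted and reference relative poses."""
    rel_pred = relative_pose(T_a_pred, T_b_pred)
    rel_ref = relative_pose(T_a_ref, T_b_ref)
    t_err = np.linalg.norm(rel_pred[:3, 3] - rel_ref[:3, 3])
    # Residual rotation that maps the reference relative rotation onto the predicted one.
    r_err = rotation_angle_deg(rel_ref[:3, :3].T @ rel_pred[:3, :3])
    return t_err, r_err


def frame_pairs(n_frames, adjacent_gap=1, distant_gap=10):
    """Index pairs for adjacent and distant frames (the gap values are assumptions)."""
    pairs = []
    for i in range(n_frames):
        for gap in (adjacent_gap, distant_gap):
            if i + gap < n_frames:
                pairs.append((i, i + gap))
    return pairs


# Usage: reference poses for two frames and predictions that are slightly off.
T0_ref = make_pose(0.0, [0.0, 0.0, 0.0])
T1_ref = make_pose(10.0, [1.0, 0.0, 0.0])
T0_pred = make_pose(0.0, [0.0, 0.0, 0.0])
T1_pred = make_pose(15.0, [1.0, 0.5, 0.0])
t_err, r_err = relative_pose_error(T0_pred, T1_pred, T0_ref, T1_ref)
```

An error of this kind is unchanged when the same rigid transform is applied to both predicted poses, so relative terms on their own do not fix the global frame; some absolute supervision is still needed to anchor it.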