Learning Neural Volumetric Pose Features for Camera Localization (2403.12800v4)
Abstract: We introduce a novel neural volumetric pose feature, termed PoseMap, designed to enhance camera localization by encapsulating the information shared between images and their associated camera poses. Our framework combines an Absolute Pose Regression (APR) architecture with an augmented NeRF module. This integration not only generates novel views that enrich the training dataset but also enables the learning of effective pose features. We further extend the architecture to support self-supervised online alignment, so that the method can be applied to, and fine-tuned on, unlabelled images within a unified framework. Experiments show that our method achieves average performance gains of 14.28% and 20.51% on indoor and outdoor benchmark scenes, respectively, outperforming existing APR methods and reaching state-of-the-art accuracy.
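For readers unfamiliar with how APR predictions are usually scored, the following is a minimal sketch of a pose-error computation. It assumes a pose is given as a translation vector plus a unit quaternion in (w, x, y, z) order, and that error is reported as Euclidean translation distance and rotation angle in degrees. The abstract does not state which metric underlies the reported gains, and the function name `pose_errors` is hypothetical rather than taken from the paper.

```python
import numpy as np


def pose_errors(t_pred, q_pred, t_gt, q_gt):
    """Return (translation error, rotation error in degrees).

    Assumption: quaternions are in (w, x, y, z) order; they are
    normalized here, so inputs need not be exactly unit length.
    """
    t_pred = np.asarray(t_pred, dtype=float)
    t_gt = np.asarray(t_gt, dtype=float)
    # Translation error: Euclidean distance between camera positions.
    t_err = float(np.linalg.norm(t_pred - t_gt))

    q_pred = np.asarray(q_pred, dtype=float)
    q_gt = np.asarray(q_gt, dtype=float)
    q_pred = q_pred / np.linalg.norm(q_pred)
    q_gt = q_gt / np.linalg.norm(q_gt)
    # q and -q encode the same rotation, so take the absolute dot product.
    d = np.clip(abs(np.dot(q_pred, q_gt)), 0.0, 1.0)
    # Angle of the relative rotation between the two orientations.
    r_err = float(np.degrees(2.0 * np.arccos(d)))
    return t_err, r_err


# Usage: prediction is offset by (3, 4, 0) and rotated 90 degrees about z.
q_z90 = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]
t_err, r_err = pose_errors([3.0, 4.0, 0.0], [1.0, 0.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0], q_z90)
```

Benchmarks for APR methods commonly summarize such per-image errors with a median over each scene; whether the percentages above are computed that way is not stated in the abstract.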