360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries (2311.17389v3)
Abstract: Portable 360$^\circ$ cameras are becoming a cheap and efficient tool to establish large visual databases. By capturing omnidirectional views of a scene, these cameras could expedite building environment models that are essential for visual localization. However, such an advantage is often overlooked due to the lack of valuable datasets. This paper introduces a new benchmark dataset, 360Loc, composed of 360$^\circ$ images with ground truth poses for visual localization. We present a practical implementation of 360$^\circ$ mapping combining 360$^\circ$ images with lidar data to generate the ground truth 6DoF poses. 360Loc is the first dataset and benchmark that explores the challenge of cross-device visual positioning, involving 360$^\circ$ reference frames, and query frames from pinhole, ultra-wide FoV fisheye, and 360$^\circ$ cameras. We propose a virtual camera approach to generate lower-FoV query frames from 360$^\circ$ images, which ensures a fair comparison of performance among different query types in visual localization tasks. We also extend this virtual camera approach to feature matching-based and pose regression-based methods to alleviate the performance loss caused by the cross-device domain gap, and evaluate its effectiveness against state-of-the-art baselines. We demonstrate that omnidirectional visual localization is more robust in challenging large-scale scenes with symmetries and repetitive structures. These results provide new insights into 360$^\circ$ camera mapping and omnidirectional visual localization with cross-device queries.
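The virtual camera idea mentioned in the abstract, generating lower-FoV frames from a 360$^\circ$ image, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `render_virtual_pinhole`, the axis conventions (x right, y down, z forward), the equirectangular input layout, and the nearest-neighbour sampling are all assumptions made for illustration. It casts one ray per output pixel, rotates the rays by the virtual camera's yaw and pitch, and looks up the corresponding longitude and latitude in the panorama.

```python
import numpy as np


def render_virtual_pinhole(pano, fov_deg, yaw_deg=0.0, pitch_deg=0.0,
                           out_w=640, out_h=480):
    """Sample a pinhole view from an equirectangular panorama.

    Hypothetical helper: assumes `pano` is H x W (or H x W x C) with
    longitude 0 at the centre column and latitude 0 at the centre row.
    """
    pano_h, pano_w = pano.shape[:2]

    # Focal length in pixels, derived from the horizontal field of view.
    f = 0.5 * out_w / np.tan(np.radians(fov_deg) / 2.0)
    cx, cy = (out_w - 1) / 2.0, (out_h - 1) / 2.0

    # One unit ray per output pixel in the camera frame (x right, y down, z forward).
    u, v = np.meshgrid(np.arange(out_w), np.arange(out_h))
    dirs = np.stack([(u - cx) / f, (v - cy) / f, np.ones(u.shape)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate rays: yaw about the vertical axis, pitch about the horizontal axis.
    # With y pointing down, a positive pitch tilts the view upwards.
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    rot_y = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(yaw), 0.0, np.cos(yaw)]])
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(pitch), -np.sin(pitch)],
                      [0.0, np.sin(pitch), np.cos(pitch)]])
    dirs = dirs @ (rot_y @ rot_x).T

    # Ray direction -> spherical angles -> panorama pixel indices.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])
    lat = np.arcsin(np.clip(-dirs[..., 1], -1.0, 1.0))
    cols = np.floor((lon / (2.0 * np.pi) + 0.5) * pano_w).astype(int) % pano_w
    rows = np.clip(np.floor((0.5 - lat / np.pi) * pano_h).astype(int), 0, pano_h - 1)

    # Nearest-neighbour lookup; a real pipeline would likely interpolate.
    return pano[rows, cols]
```

Calling the function with several yaw values on the same panorama yields a set of pinhole-like crops that cover the full horizontal circle, which is the kind of lower-FoV frame the abstract refers to. Fisheye queries would need a different projection model in place of the pinhole ray equation.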
Authors:
- Huajian Huang
- Changkun Liu
- Yipeng Zhu
- Hui Cheng
- Tristan Braud
- Sai-Kit Yeung