
360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries (2311.17389v3)

Published 29 Nov 2023 in cs.CV

Abstract: Portable 360$^\circ$ cameras are becoming a cheap and efficient tool to establish large visual databases. By capturing omnidirectional views of a scene, these cameras could expedite building environment models that are essential for visual localization. However, such an advantage is often overlooked due to the lack of valuable datasets. This paper introduces a new benchmark dataset, 360Loc, composed of 360$^\circ$ images with ground truth poses for visual localization. We present a practical implementation of 360$^\circ$ mapping combining 360$^\circ$ images with lidar data to generate the ground truth 6DoF poses. 360Loc is the first dataset and benchmark that explores the challenge of cross-device visual positioning, involving 360$^\circ$ reference frames, and query frames from pinhole, ultra-wide FoV fisheye, and 360$^\circ$ cameras. We propose a virtual camera approach to generate lower-FoV query frames from 360$^\circ$ images, which ensures a fair comparison of performance among different query types in visual localization tasks. We also extend this virtual camera approach to feature matching-based and pose regression-based methods to alleviate the performance loss caused by the cross-device domain gap, and evaluate its effectiveness against state-of-the-art baselines. We demonstrate that omnidirectional visual localization is more robust in challenging large-scale scenes with symmetries and repetitive structures. These results provide new insights into 360$^\circ$ camera mapping and omnidirectional visual localization with cross-device queries.
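
The abstract does not spell out how the virtual camera is implemented, so the following is only a minimal sketch of the general idea of rendering a lower-FoV view from a 360$^\circ$ image. It assumes an equirectangular panorama, an ideal pinhole model, and nearest-neighbour sampling; the function name `virtual_pinhole_view` and all of its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def virtual_pinhole_view(pano, fov_deg=90.0, out_w=64, out_h=64,
                         yaw_deg=0.0, pitch_deg=0.0):
    """Sample a pinhole view from an equirectangular panorama (illustrative only).

    Assumptions: pano is an (H, W) or (H, W, C) array whose columns span
    longitude [-pi, pi) and whose rows span latitude [-pi/2, pi/2] (top to bottom).
    """
    pano_h, pano_w = pano.shape[:2]

    # Focal length in pixels from the requested horizontal field of view.
    f = 0.5 * out_w / np.tan(np.radians(fov_deg) / 2.0)

    # Pixel-centre grid of the virtual camera.
    u, v = np.meshgrid(np.arange(out_w) + 0.5, np.arange(out_h) + 0.5)

    # Unit rays in the camera frame: x right, y down, z forward.
    rays = np.stack([(u - out_w / 2.0) / f,
                     (v - out_h / 2.0) / f,
                     np.ones_like(u)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate rays into the panorama frame: pitch about x, then yaw about y.
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    rot_y = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(yaw), 0.0, np.cos(yaw)]])
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(pitch), -np.sin(pitch)],
                      [0.0, np.sin(pitch), np.cos(pitch)]])
    rays = rays @ (rot_y @ rot_x).T

    # Ray direction -> spherical angles -> panorama pixel indices.
    lon = np.arctan2(rays[..., 0], rays[..., 2])
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))
    px = ((lon / (2.0 * np.pi) + 0.5) * pano_w).astype(int) % pano_w
    py = np.clip(((lat / np.pi + 0.5) * pano_h).astype(int), 0, pano_h - 1)

    # Nearest-neighbour lookup; a real pipeline would likely interpolate.
    return pano[py, px]
```

Calling `virtual_pinhole_view(pano, fov_deg=90.0, yaw_deg=90.0)` would yield a 90$^\circ$ view centred a quarter turn to the right of the panorama centre. Sweeping `yaw_deg` over several values gives multiple lower-FoV crops from one 360$^\circ$ frame, which is the kind of query generation the abstract alludes to; fisheye queries would need a different projection model that this sketch does not cover.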

Authors (6)
  1. Huajian Huang (12 papers)
  2. Changkun Liu (9 papers)
  3. Yipeng Zhu (3 papers)
  4. Hui Cheng (40 papers)
  5. Tristan Braud (34 papers)
  6. Sai-Kit Yeung (52 papers)