
Representing 3D sparse map points and lines for camera relocalization (2402.18011v1)

Published 28 Feb 2024 in cs.CV

Abstract: Recent advancements in visual localization and mapping have demonstrated considerable success in integrating point and line features. However, expanding the localization framework to include additional mapping components frequently results in increased demand for memory and computational resources dedicated to matching tasks. In this study, we show how a lightweight neural network can learn to represent both 3D point and line features, and exhibit leading pose accuracy by harnessing the power of multiple learned mappings. Specifically, we utilize a single transformer block to encode line features, effectively transforming them into distinctive point-like descriptors. Subsequently, we treat these point and line descriptor sets as distinct yet interconnected feature sets. Through the integration of self- and cross-attention within several graph layers, our method effectively refines each feature before regressing 3D maps using two simple MLPs. In comprehensive experiments, our indoor localization findings surpass those of Hloc and Limap across both point-based and line-assisted configurations. Moreover, in outdoor scenarios, our method secures a significant lead, marking the most considerable enhancement over state-of-the-art learning-based methodologies. The source code and demo videos of this work are publicly available at: https://thpjp.github.io/pl2map/
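The pipeline described in the abstract can be sketched roughly as follows. This is a minimal, untrained illustration using NumPy, not the authors' implementation: the descriptor size `D`, the mean pooling inside `encode_line`, the single-head attention, the layer count, and the 6-value line output (two 3D endpoints) are all assumptions made for the example, and every function name here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32  # descriptor dimension (assumed for illustration)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_attention(d):
    # Random query/key/value projections (untrained, illustration only).
    return {name: rng.normal(0.0, d ** -0.5, (d, d)) for name in ("q", "k", "v")}

def attention(params, x, context):
    # Single-head scaled dot-product attention with a residual connection.
    q, k, v = x @ params["q"], context @ params["k"], context @ params["v"]
    weights = softmax(q @ k.T / np.sqrt(x.shape[-1]))
    return x + weights @ v

def init_mlp(d_in, d_hidden, d_out):
    return (rng.normal(0.0, d_in ** -0.5, (d_in, d_hidden)),
            rng.normal(0.0, d_hidden ** -0.5, (d_hidden, d_out)))

def mlp(params, x):
    # Two-layer perceptron with a ReLU in between.
    w1, w2 = params
    return np.maximum(x @ w1, 0.0) @ w2

def encode_line(params, samples):
    # samples: (S, D) descriptors sampled along one 2D line segment.
    # One attention block plus mean pooling (an assumption) yields a
    # single point-like descriptor for the whole line.
    return attention(params, samples, samples).mean(axis=0)

def init_model(num_layers=2):
    return {
        "line_enc": init_attention(D),
        "layers": [{k: init_attention(D)
                    for k in ("self_p", "self_l", "cross_p", "cross_l")}
                   for _ in range(num_layers)],
        "point_head": init_mlp(D, 64, 3),  # one 3D coordinate per point
        "line_head": init_mlp(D, 64, 6),   # two 3D endpoints per line (assumed)
    }

def forward(model, point_desc, line_samples):
    # point_desc: (N, D); line_samples: list of M arrays, each (S_i, D).
    lines = np.stack([encode_line(model["line_enc"], s) for s in line_samples])
    points = point_desc
    for layer in model["layers"]:
        # Self-attention refines each feature set on its own.
        points = attention(layer["self_p"], points, points)
        lines = attention(layer["self_l"], lines, lines)
        # Cross-attention lets points and lines exchange context.
        new_points = attention(layer["cross_p"], points, lines)
        new_lines = attention(layer["cross_l"], lines, points)
        points, lines = new_points, new_lines
    # Two separate MLP heads regress the 3D point map and 3D line map.
    return mlp(model["point_head"], points), mlp(model["line_head"], lines)

model = init_model()
points_2d_desc = rng.normal(size=(5, D))
lines_2d_desc = [rng.normal(size=(4, D)) for _ in range(3)]
points_3d, lines_3d = forward(model, points_2d_desc, lines_2d_desc)
```

With trained weights, the regressed 3D points and line endpoints would be paired with their 2D image locations and passed to a pose solver; that step is outside the scope of this sketch.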
