PRISM-TopoMap: Online Topological Mapping with Place Recognition and Scan Matching (2404.01674v4)
Abstract: Mapping is one of the crucial tasks enabling autonomous navigation of a mobile robot. Conventional mapping methods output a dense geometric map representation, e.g. an occupancy grid, which is not trivial to keep consistent over prolonged runs covering large environments. Meanwhile, capturing the topological structure of the workspace enables fast path planning, is typically less prone to odometry error accumulation, and does not consume much memory. Following this idea, this paper introduces PRISM-TopoMap, a topological mapping method that maintains a graph of locally aligned locations without relying on global metric coordinates. The proposed method combines an original learnable multimodal place-recognition module with a scan-matching pipeline for localization and loop closure in the graph of locations. The graph is updated online, and the robot is localized in the proper node at each time step. We conduct a broad experimental evaluation of the suggested approach in a range of photo-realistic environments and on a real robot, and compare it to the state of the art. The results of the empirical evaluation confirm that PRISM-TopoMap consistently outperforms its competitors in computational efficiency, achieves high mapping quality, and performs well on a real robot. The code of PRISM-TopoMap is open-sourced and available at: https://github.com/kiriLLMouraviev/prism-topomap.
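To make the mapping loop described in the abstract concrete, below is a minimal Python sketch of maintaining a graph of locations with online localization and loop closure. It is an illustration under assumptions, not the authors' implementation (see the linked repository for that): the `TopoMap` class, the `match_threshold` and `top_k` parameters, and the cosine-similarity stand-ins for the learned place-recognition descriptor and the scan matcher are all hypothetical placeholders.

```python
# Sketch of the graph-of-locations idea: local tracking via scan matching
# against the current node's neighbors, loop closure via place-recognition
# retrieval over the whole graph, and online insertion of new locations.
# All thresholds and the toy descriptor/matcher are illustrative assumptions.
import numpy as np
import networkx as nx


class TopoMap:
    def __init__(self, match_threshold=0.8, top_k=3):
        self.graph = nx.Graph()      # nodes hold scans; edges hold relative transforms
        self.current = None          # node the robot is currently localized in
        self.match_threshold = match_threshold
        self.top_k = top_k
        self._next_id = 0

    def _descriptor(self, scan):
        # Stand-in for a learned multimodal place-recognition descriptor.
        d = np.asarray(scan, dtype=float).ravel()
        return d / (np.linalg.norm(d) + 1e-9)

    def _scan_match(self, scan_a, scan_b):
        # Stand-in for a real scan matcher (e.g., ICP-style registration);
        # returns a similarity score and a dummy relative transform.
        score = float(self._descriptor(scan_a) @ self._descriptor(scan_b))
        return score, np.eye(3)

    def localize(self, scan):
        # Local tracking: consider the current node and its graph neighbors.
        candidates = set()
        if self.current is not None:
            candidates.add(self.current)
            candidates.update(self.graph.neighbors(self.current))
        # Loop closure: retrieve the top-k most similar nodes graph-wide.
        desc = self._descriptor(scan)
        ranked = sorted(
            self.graph.nodes,
            key=lambda n: -float(desc @ self._descriptor(self.graph.nodes[n]["scan"])),
        )
        candidates.update(ranked[: self.top_k])
        # Verify all candidates geometrically with the scan matcher.
        best_id, best_score = None, -1.0
        for node_id in candidates:
            score, _ = self._scan_match(scan, self.graph.nodes[node_id]["scan"])
            if score > best_score:
                best_id, best_score = node_id, score
        return best_id if best_score >= self.match_threshold else None

    def update(self, scan):
        # Online update: stay in a matched node, or add a new location
        # connected to the previous one.
        node_id = self.localize(scan)
        if node_id is None:
            node_id = self._next_id
            self._next_id += 1
            self.graph.add_node(node_id, scan=np.asarray(scan, dtype=float))
            if self.current is not None:
                self.graph.add_edge(self.current, node_id, transform=np.eye(3))
        self.current = node_id
        return node_id
```

The key design point this sketch mirrors is that candidate matches come from two sources, cheap local matching around the current node and graph-wide place-recognition retrieval, and every candidate is verified by scan matching before the graph is updated, so no global metric coordinates are ever needed.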