
Appearance-Based Loop Closure Detection for Online Large-Scale and Long-Term Operation (2407.15304v1)

Published 22 Jul 2024 in cs.RO and cs.CV

Abstract: In appearance-based localization and mapping, loop closure detection is the process used to determine whether the current observation comes from a previously visited location or a new one. As the size of the internal map increases, so does the time required to compare new observations with all stored locations, eventually limiting online processing. This paper presents an online loop closure detection approach for large-scale and long-term operation. The approach is based on a memory management method, which limits the number of locations used for loop closure detection so that the computation time remains under real-time constraints. The idea consists of keeping the most recent and frequently observed locations in a Working Memory (WM) used for loop closure detection, and transferring the others into a Long-Term Memory (LTM). When a match is found between the current location and one stored in WM, associated locations stored in LTM can be updated and remembered for additional loop closure detections. Results demonstrate the approach's adaptability and scalability using ten standard data sets from other appearance-based loop closure approaches, one custom data set using real images taken over a 2 km loop of our university campus, and one custom data set (7 hours) using virtual images from the racing video game "Need for Speed: Most Wanted".


Summary

  • The paper introduces a dynamic dual-memory architecture for SLAM that efficiently identifies loop closures during online, long-term operations.
  • It employs a bag-of-words model with dynamic Bayesian filtering to balance computational load and achieve high recall at 100% precision.
  • Empirical tests on diverse datasets validate its robustness under varying conditions, supporting scalable and continuous autonomous navigation.

Appearance-Based Loop Closure Detection for Online Large-Scale and Long-Term Operation

The paper "Appearance-Based Loop Closure Detection for Online Large-Scale and Long-Term Operation" by Mathieu Labbe and François Michaud explores a sophisticated approach for real-time loop closure detection in autonomous robotic navigation using appearance-based methods. The primary innovation presented is the integration of a dynamic memory management system that addresses the challenges of scalability and adaptability in simultaneous localization and mapping (SLAM) over extended operational periods and expansive areas.

Core Methodology

The proposed approach is built around a dual-memory system comprising a Working Memory (WM) and a Long-Term Memory (LTM). WM holds the locations actively used for loop closure detection, while LTM stores the remaining locations, which can be retrieved when needed. Keeping only the most recent and most frequently observed locations in WM bounds the computational cost of comparing new observations against the map, regardless of the total map size. When a loop closure is detected, locations associated with the match are retrieved from LTM back into WM, making subsequent loop closures in that area more likely.
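To make the mechanism concrete, here is a minimal Python sketch of such a dual-memory manager. The class, the fixed WM capacity, and the transfer policy (evict the lowest-weight location, breaking ties by age) are illustrative assumptions for exposition, not the paper's exact implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Location:
    loc_id: int
    weight: int = 0                               # bumped each time the place is re-observed
    neighbors: set = field(default_factory=set)   # ids of adjacent locations in the map graph

class MemoryManager:
    """Toy WM/LTM manager (names and policies are illustrative)."""

    def __init__(self, wm_capacity: int):
        self.wm_capacity = wm_capacity
        self.wm: dict[int, Location] = {}    # Working Memory: loop closure candidates
        self.ltm: dict[int, Location] = {}   # Long-Term Memory: transferred locations

    def add(self, loc: Location) -> None:
        self.wm[loc.loc_id] = loc
        self._enforce_capacity()

    def _enforce_capacity(self) -> None:
        # Transfer the lowest-weight locations (ties broken by oldest id)
        # until WM fits the real-time budget; recent and frequently
        # observed locations stay.
        while len(self.wm) > self.wm_capacity:
            victim = min(self.wm, key=lambda i: (self.wm[i].weight, i))
            self.ltm[victim] = self.wm.pop(victim)

    def on_loop_closure(self, matched_id: int) -> None:
        # Reward the matched location, then bring its LTM neighbors back
        # into WM so nearby places can be matched on upcoming frames.
        self.wm[matched_id].weight += 1
        for nid in list(self.wm[matched_id].neighbors):
            if nid in self.ltm:
                self.wm[nid] = self.ltm.pop(nid)
        self._enforce_capacity()
```

The key property is that loop closure detection only ever compares against WM, so its cost is bounded by `wm_capacity` rather than by the total map size.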

The technique combines a bag-of-words (BoW) model with Bayesian filtering, managing the trade-off between the size of the mapped environment and the time required to search through previously visited locations. Because memory usage adapts dynamically to the computational demand, the robot maintains real-time processing no matter how large the map grows.
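The filtering step can be illustrated with a small discrete Bayes filter over loop closure hypotheses. This sketch assumes a column-stochastic transition matrix that diffuses belief to neighboring locations and uses raw BoW similarity scores as likelihoods; the paper's filter includes refinements (e.g., an explicit "new location" hypothesis) that are omitted here:

```python
import numpy as np

def bayes_update(prior: np.ndarray, likelihood: np.ndarray,
                 transition: np.ndarray) -> np.ndarray:
    """One prediction/correction step of a discrete Bayes filter.
    prior[i]      -- belief that location i closed the loop last frame
    likelihood[i] -- BoW similarity of the current image to location i
    transition    -- column-stochastic matrix spreading belief to neighbors
    """
    predicted = transition @ prior        # prediction: belief drifts along the map
    posterior = likelihood * predicted    # correction: weight by image similarity
    return posterior / posterior.sum()    # renormalize to a probability distribution

# Tiny usage example over 5 hypothetical locations.
n = 5
transition = 0.8 * np.eye(n) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
transition /= transition.sum(axis=0, keepdims=True)   # make columns sum to 1
belief = np.full(n, 1.0 / n)                          # uniform initial belief
scores = np.array([0.1, 0.2, 3.0, 0.4, 0.1])          # BoW scores for the current image
belief = bayes_update(belief, scores, transition)     # belief concentrates on location 2
```

In this style of filter, a loop closure is accepted only when the posterior of some hypothesis exceeds a threshold; otherwise the observation is treated as a new location.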

Experimental Results

The paper reports empirical results on diverse data sets, including well-known SLAM benchmarks as well as custom environments such as a university campus and a video-game-generated cityscape. The system achieves high recall at 100% precision, comparable to or exceeding existing appearance-based loop closure methods. Importantly, it consistently meets real-time constraints: the maximum processing time stays below the image acquisition interval.
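Recall at 100% precision is the standard figure of merit for this task: raise the acceptance threshold until no false positives remain, then report the fraction of true loop closures still detected. A short sketch of that computation (function name and inputs are illustrative):

```python
import numpy as np

def max_recall_at_full_precision(scores: np.ndarray,
                                 is_true_loop: np.ndarray) -> float:
    """Highest recall achievable with zero false positives.
    scores       -- loop closure confidence for each candidate detection
    is_true_loop -- ground-truth boolean for each candidate detection
    """
    false_scores = scores[~is_true_loop]
    # Threshold just above the best-scoring false match, so every
    # accepted detection is a true positive (precision = 100%).
    threshold = false_scores.max() if false_scores.size else -np.inf
    accepted = scores > threshold
    return float(accepted.sum() / is_true_loop.sum())
```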

A particular strength of the system is its resilience to varied environmental conditions; tests under changing illumination and in dynamic scenes confirm the robustness of the approach. Recall improves because location data can be dynamically retrieved from and transferred to LTM, letting the system adapt efficiently to new or changing environmental features.

Implications and Future Directions

The implications of this research are significant for deploying autonomous systems in dynamic, large-scale environments. Maintaining a real-time processing workflow without compromising loop closure accuracy enables robots to operate continuously over extended periods, which is crucial for applications requiring long-term autonomy, such as surveillance, exploration, and search-and-rescue missions.

Future work could further optimize the computational cost of the retrieval and transfer processes, possibly by integrating more advanced feature descriptors or machine learning models tailored to dynamic environments. Exploring other memory management heuristics, such as adaptive policies informed by real-time operational context or user-defined priorities, could also improve performance and adaptability.

In conclusion, Labbé and Michaud contribute a notable advance to the field of SLAM: a scalable, adaptive loop closure detection method that remains efficient and effective over long-term operation. The work extends the capabilities of autonomous navigation systems and lays the groundwork for further innovation in robust real-time mapping.