PlaceNav: Topological Navigation through Place Recognition (2309.17260v4)

Published 29 Sep 2023 in cs.RO, cs.AI, and cs.LG

Abstract: Recent results suggest that splitting topological navigation into robot-independent and robot-specific components improves navigation performance, because the robot-independent part can be trained with data collected by robots of different types. However, the performance of these navigation methods is still limited by the scarcity of suitable training data, and they suffer from poor computational scaling. In this work, we present PlaceNav, which subdivides the robot-independent part into navigation-specific and generic computer-vision components. We use visual place recognition for subgoal selection in the topological navigation pipeline. This makes subgoal selection more efficient and enables leveraging large-scale datasets from non-robotics sources, increasing training data availability. Bayesian filtering, enabled by place recognition, further improves navigation performance by increasing the temporal consistency of subgoals. Our experimental results verify the design: the new method obtains a 76% higher success rate in indoor navigation tasks and a 23% higher success rate in outdoor ones, with higher computational efficiency.
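To make the subgoal-selection idea concrete, below is a minimal sketch of place-recognition-based subgoal selection combined with a discrete Bayes filter over topological map nodes. This is not the paper's implementation: the random descriptors standing in for a place-recognition model, the softmax similarity-to-likelihood mapping with its temperature parameter, and the forward-biased transition probabilities are all illustrative assumptions, and only NumPy is used.

```python
import numpy as np

def forward_transition(n_nodes, p_stay=0.3, p_advance=0.6):
    """Forward-biased motion model over route nodes: the robot mostly stays
    at or advances past the current node. Probabilities are illustrative,
    not taken from the paper. Columns sum to 1: T[j, i] = P(next=j | cur=i)."""
    T = np.zeros((n_nodes, n_nodes))
    for i in range(n_nodes):
        T[i, i] += p_stay
        T[min(i + 1, n_nodes - 1), i] += p_advance
        T[max(i - 1, 0), i] += 1.0 - p_stay - p_advance  # small slip backward
    return T

def select_subgoal(query_desc, node_descs, belief, T, temperature=0.1):
    """One navigation step: retrieve map nodes by descriptor similarity,
    fuse with the Bayes-filtered belief, return the MAP node index."""
    # Cosine similarity between the query image descriptor and every node.
    q = query_desc / np.linalg.norm(query_desc)
    db = node_descs / np.linalg.norm(node_descs, axis=1, keepdims=True)
    sims = db @ q
    # Softmax turns similarities into a likelihood (temperature is a knob).
    likelihood = np.exp((sims - sims.max()) / temperature)
    # Discrete Bayes filter: predict with the motion model, then correct.
    posterior = likelihood * (T @ belief)
    posterior /= posterior.sum()
    return int(np.argmax(posterior)), posterior

# Toy usage with random descriptors in place of a trained VPR model.
rng = np.random.default_rng(0)
n_nodes, dim = 50, 256
node_descs = rng.normal(size=(n_nodes, dim))
belief = np.full(n_nodes, 1.0 / n_nodes)             # uniform initial belief
T = forward_transition(n_nodes)
query = node_descs[7] + 0.1 * rng.normal(size=dim)   # noisy view of node 7
subgoal, belief = select_subgoal(query, node_descs, belief, T)
print("selected subgoal node:", subgoal)
```

The temporal prior supplied by the transition model is what suppresses the spurious retrieval matches that a per-frame argmax over similarities would produce, which is the temporal-consistency benefit the abstract attributes to Bayesian filtering; the specific temperature and transition probabilities here would need to be set empirically.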
