NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration (2310.07896v1)
Abstract: Robotic learning for navigation in unfamiliar environments needs to provide policies for both task-oriented navigation (i.e., reaching a goal that the robot has located), and task-agnostic exploration (i.e., searching for a goal in a novel setting). Typically, these roles are handled by separate models, for example by using subgoal proposals, planning, or separate navigation strategies. In this paper, we describe how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration, with the latter providing the ability to search novel environments, and the former providing the ability to reach a user-specified goal once it has been located. We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments, as compared to approaches that use subgoal proposals from generative models, or prior methods based on latent variable models. We instantiate our method by using a large-scale Transformer-based policy trained on data from multiple ground robots, with a diffusion model decoder to flexibly handle both goal-conditioned and goal-agnostic navigation. Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods, and demonstrate significant improvements in performance and lower collision rates, despite utilizing smaller models than state-of-the-art approaches. For more videos, code, and pre-trained model checkpoints, see https://general-navigation-models.github.io/nomad/
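The core mechanism the abstract describes — one policy that serves both goal-directed navigation and goal-agnostic exploration by masking the goal conditioning — can be illustrated with a toy sketch. This is not the paper's implementation: the linear "denoiser" `W`, the dimensions, and the fixed step size stand in for the actual Transformer encoder and DDPM-style diffusion decoder, and `goal_mask` is the binary mask the title refers to.

```python
import numpy as np

OBS_DIM, GOAL_DIM, HORIZON, ACT_DIM = 8, 8, 4, 2

rng = np.random.default_rng(0)
# Toy linear "denoiser" weights, a stand-in for the learned
# Transformer + diffusion action decoder.
W = rng.normal(size=(HORIZON * ACT_DIM,
                     HORIZON * ACT_DIM + OBS_DIM + GOAL_DIM + 1))

def context(obs, goal, goal_mask):
    # goal_mask = 1 -> exploration: goal features zeroed (goal-agnostic)
    # goal_mask = 0 -> navigation: goal features kept (goal-conditioned)
    return np.concatenate([obs, goal * (1.0 - goal_mask)])

def sample_actions(obs, goal, goal_mask, n_steps=10):
    # Iteratively refine an action sequence, conditioned on the
    # (possibly goal-masked) context, mimicking a denoising sampler.
    acts = np.zeros(HORIZON * ACT_DIM)  # fixed init so runs are comparable
    ctx = context(obs, goal, goal_mask)
    for t in range(n_steps, 0, -1):
        inp = np.concatenate([acts, ctx, [t / n_steps]])
        eps_hat = W @ inp            # predicted "noise"
        acts = acts - 0.1 * eps_hat  # one denoising step
    return acts.reshape(HORIZON, ACT_DIM)

obs = rng.normal(size=OBS_DIM)
goal_a = rng.normal(size=GOAL_DIM)
goal_b = rng.normal(size=GOAL_DIM)

explore_a = sample_actions(obs, goal_a, goal_mask=1.0)
explore_b = sample_actions(obs, goal_b, goal_mask=1.0)
navigate_a = sample_actions(obs, goal_a, goal_mask=0.0)

# With the mask on, the policy ignores the goal entirely:
assert np.allclose(explore_a, explore_b)
# With the mask off, the goal changes the sampled trajectory:
assert not np.allclose(explore_a, navigate_a)
```

The point of the sketch is the single set of weights `W`: switching between exploration and goal-reaching requires no separate model, only flipping `goal_mask`, which is how the paper's unified policy avoids the separate subgoal-proposal or latent-variable machinery of prior approaches.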