
Feudal Networks for Visual Navigation (2402.12498v3)

Published 19 Feb 2024 in cs.CV, cs.LG, and cs.RO

Abstract: Visual navigation follows the intuition that humans can navigate without detailed maps. A common approach is interactive exploration while building a topological graph with images at nodes that can be used for planning. Recent variations learn from passive videos and can navigate using complex social and semantic cues. However, a significant number of training videos are needed, large graphs are utilized, and scenes are not unseen since odometry is utilized. We introduce a new approach to visual navigation using feudal learning, which employs a hierarchical structure consisting of a worker agent, a mid-level manager, and a high-level manager. Key to the feudal learning paradigm, agents at each level see a different aspect of the task and operate at different spatial and temporal scales. Two unique modules are developed in this framework. For the high-level manager, we learn a memory proxy map in a self-supervised manner to record prior observations in a learned latent space and avoid the use of graphs and odometry. For the mid-level manager, we develop a waypoint network that outputs intermediate subgoals imitating human waypoint selection during local navigation. This waypoint network is pre-trained using a new, small set of teleoperation videos that we make publicly available, with training environments different from testing environments. The resulting feudal navigation network achieves near-SOTA performance, while providing a novel no-RL, no-graph, no-odometry, no-metric-map approach to the image goal navigation task.
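
The three-level hierarchy described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper's memory proxy map and waypoint network are learned models, whereas the sketch below replaces them with hand-written rules purely to show how information might flow from the high-level manager to the worker. All names (`HighLevelManager`, `select_waypoint`, `worker_action`, `step`), the 2-D latent coordinates, the grid size, and the three-action set are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HighLevelManager:
    """Toy stand-in for the memory proxy map: visit counts on a coarse grid
    over a hypothetical 2-D latent space with coordinates in [0, 1)."""
    grid_size: int = 8
    visits: dict = field(default_factory=dict)

    def _cell(self, latent):
        # Quantize each latent coordinate to a grid index, clamped to the grid.
        return tuple(
            min(self.grid_size - 1, max(0, int(v * self.grid_size)))
            for v in latent
        )

    def record(self, latent):
        # Remember that this latent location has been observed.
        cell = self._cell(latent)
        self.visits[cell] = self.visits.get(cell, 0) + 1

    def preferred_side(self, latent):
        # Return -1, 0, or +1: the least-visited cell among the current cell
        # and its horizontal neighbours (assumed exploration heuristic).
        cx, cy = self._cell(latent)
        counts = {0: self.visits.get((cx, cy), 0)}
        if cx > 0:
            counts[-1] = self.visits.get((cx - 1, cy), 0)
        if cx < self.grid_size - 1:
            counts[1] = self.visits.get((cx + 1, cy), 0)
        return min(counts, key=lambda side: (counts[side], abs(side)))


def select_waypoint(side, image_width):
    """Mid-level manager (hand-written placeholder for a waypoint network):
    turn the high-level hint into a pixel column in the current image."""
    return image_width // 2 + side * (image_width // 4)


def worker_action(waypoint_x, image_width, tolerance=0.1):
    """Worker: choose a low-level action that steers toward the waypoint."""
    offset = (waypoint_x - image_width / 2) / image_width
    if offset < -tolerance:
        return "turn_left"
    if offset > tolerance:
        return "turn_right"
    return "forward"


def step(manager, latent, image_width=640):
    """One pass through the hierarchy: memory update, subgoal, action."""
    manager.record(latent)
    side = manager.preferred_side(latent)
    waypoint_x = select_waypoint(side, image_width)
    return worker_action(waypoint_x, image_width)
```

Calling `step(HighLevelManager(), (0.5, 0.5))` runs one decision cycle. Each level works at a different scale in this sketch: the manager reasons over grid cells, the waypoint is a pixel location, and the worker emits a single motor command, which loosely mirrors the separation of spatial and temporal scales the abstract attributes to feudal learning.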

Authors (5)
  1. Faith Johnson (8 papers)
  2. Bryan Bo Cao (9 papers)
  3. Kristin Dana (27 papers)
  4. Shubham Jain (40 papers)
  5. Ashwin Ashok (12 papers)
