Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation (2307.06125v3)
Abstract: Existing object-search approaches enable robots to search along free pathways; however, robots operating in unstructured human-centered environments frequently also have to manipulate the environment to suit their needs. In this work, we introduce a novel interactive multi-object search task in which a robot has to open doors to navigate between rooms and search inside cabinets and drawers to find target objects. These new challenges require combining manipulation and navigation skills in unexplored environments. We present HIMOS, a hierarchical reinforcement learning approach that learns to compose exploration, navigation, and manipulation skills. To achieve this, we design an abstract high-level action space around a semantic map memory and leverage the explored environment as instance navigation points. Extensive experiments in simulation and the real world demonstrate that, given accurate perception, the decision making of HIMOS transfers to new environments in a zero-shot manner. It is robust to unseen subpolicies, failures in their execution, and different robot kinematics. These capabilities open the door to a wide range of downstream tasks across embodied AI and real-world use cases.
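To make the idea of "an abstract high-level action space around a semantic map memory" concrete, the following is a minimal, hypothetical sketch. The paper does not publish this interface, so every name here (`Instance`, `SemanticMapMemory`, `high_level_actions`) is an illustrative assumption: discovered object instances are stored in a map memory, and each stored instance becomes a candidate navigation/manipulation goal for the high-level policy, with unfilled slots masked out as invalid actions.

```python
from dataclasses import dataclass, field

@dataclass
class Instance:
    """A discovered object instance stored in the semantic map memory."""
    category: str        # e.g. "cabinet", "door", "target"
    position: tuple      # (x, y) in the map frame
    opened: bool = False

@dataclass
class SemanticMapMemory:
    """Accumulates instances as the robot explores; they later serve as
    instance navigation points for the high-level policy."""
    instances: list = field(default_factory=list)

    def add(self, inst: Instance) -> None:
        self.instances.append(inst)

def high_level_actions(memory: SemanticMapMemory, max_slots: int = 4):
    """Build the abstract action space: one 'explore' action plus one
    navigate/interact action per instance slot. Slots beyond the number
    of known instances are masked out (invalid-action masking)."""
    actions = ["explore"]
    mask = [True]
    for i in range(max_slots):
        actions.append(f"goto_instance_{i}")
        mask.append(i < len(memory.instances))
    return actions, mask

# Usage: after one cabinet has been discovered, only 'explore' and the
# first instance slot are valid high-level actions.
memory = SemanticMapMemory()
memory.add(Instance("cabinet", (1.0, 2.0)))
actions, mask = high_level_actions(memory)
print(actions)  # ['explore', 'goto_instance_0', ..., 'goto_instance_3']
print(mask)     # [True, True, False, False, False]
```

The fixed-size slot layout keeps the policy's action space constant while the set of discovered instances grows, which is one common way to reconcile a learned discrete policy with an open-ended environment.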