- The paper introduces a novel subgoal recommendation policy that integrates deep reinforcement learning with model predictive control (MPC) for navigation in dynamic environments.
- It embeds learned global guidance into MPC to ensure collision avoidance and maintain dynamic feasibility during trajectory optimization.
- Simulation results show significant reductions in travel time and collision frequency, indicating the method's potential for real-world deployment.
Learning a Subgoal Recommendation Policy for Navigation in Dynamic Environments
Navigating dynamic environments is an increasingly critical challenge for autonomous robots, particularly due to the unpredictable behaviors of surrounding agents such as humans and other robots. The paper addresses this issue by proposing a novel technique that merges deep reinforcement learning (RL) with model predictive control (MPC) to enhance robot navigation performance.
Summary of the Proposed Approach
The core contribution is a subgoal recommendation policy learned with deep RL, which guides the local trajectory optimization handled by an MPC framework. Through simulations incorporating both cooperative and non-cooperative agent models, the authors trained a deep network to recommend subgoals that advance the robot toward its final goal while accounting for interactions with the surrounding agents.
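To make this concrete, below is a minimal sketch of what such a recommendation policy could look like. The observation layout, network sizes, and planning radius are illustrative assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class SubgoalPolicy(nn.Module):
    """Map an observation of the robot and nearby agents to a 2-D subgoal
    in the robot frame. Sizes and observation layout are assumptions."""

    def __init__(self, obs_dim: int = 24, hidden: int = 128, radius: float = 2.0):
        super().__init__()
        self.radius = radius  # keep subgoals within the MPC's local horizon
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # tanh bounds the recommendation to a reachable local waypoint
        return self.radius * torch.tanh(self.net(obs))

policy = SubgoalPolicy()
subgoal = policy(torch.randn(1, 24))  # e.g. robot state plus features of nearby agents
```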
The main contributions of the paper are:
- Goal-Oriented MPC (GO-MPC): A learned global guidance policy is integrated into the MPC cost function. The policy recommends subgoals that steer the optimization toward the goal, while the MPC itself enforces dynamic feasibility and collision-avoidance constraints (see the sketch after this list).
- Joint Training Algorithm: The RL agent is trained jointly with the optimization-based controller, which makes the learned policy directly applicable to real hardware and narrows the simulation-to-reality gap.
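The sketch below illustrates how these two pieces could fit together: the learned subgoal enters an MPC-style stage cost, while the controller (here replaced by a greedy stand-in for a real constrained solver) stays responsible for dynamics and collision avoidance. All function names, weights, and the simplified solver are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def stage_cost(pos, u, subgoal, w_goal=1.0, w_u=0.1):
    # Illustrative stage cost: distance to the recommended subgoal plus
    # control effort; the weights are assumptions, not the paper's values.
    err = pos - subgoal
    return w_goal * (err @ err) + w_u * (u @ u)

def solve_mpc(pos, subgoal, v_max=1.0):
    # Stand-in for the constrained MPC solve: step toward the subgoal,
    # clipped to a velocity limit. The real controller would optimize the
    # cost over a horizon subject to dynamics and collision-avoidance
    # constraints.
    u = subgoal - pos
    speed = np.linalg.norm(u)
    return u if speed <= v_max else u * (v_max / speed)

def rollout(policy, goal, pos=np.zeros(2), dt=0.1, steps=200):
    # Joint scheme: the learned policy proposes a subgoal at every step,
    # the MPC layer tracks it, and the resulting transitions would feed
    # the RL update during joint training.
    transitions = []
    for _ in range(steps):
        subgoal = policy(pos, goal)       # high-level action
        u = solve_mpc(pos, subgoal)       # low-level, constraint-respecting control
        new_pos = pos + dt * u
        reward = -stage_cost(new_pos, u, goal)
        transitions.append((pos, subgoal, u, reward, new_pos))
        pos = new_pos
    return transitions

# Trivial illustrative "policy" that always recommends the final goal.
traj = rollout(policy=lambda pos, goal: goal, goal=np.array([5.0, 0.0]))
print("final position:", traj[-1][-1])
```

In the joint training scheme, transitions collected this way would drive the RL update, so the policy learns to recommend subgoals in the presence of the same controller it is deployed with.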
Key Results
The proposed method significantly improved navigation performance relative to prior MPC frameworks, showing robust gains in travel time and collision frequency across scenarios with both cooperative and non-cooperative agents. The simulations also reveal distinct emergent behaviors: the robot traverses crowds when agents cooperate and detours around congested areas when they do not.
Implications and Future Directions
This approach has both practical and theoretical implications for navigation in dynamic environments. Practically, it supports robust real-time decision-making in crowded settings, which is essential for autonomous vehicles and social robots. Theoretically, it deepens our understanding of hybrid approaches that combine learned prediction with reactive control, opening new avenues for applying reinforcement learning to robot control.
Future work might focus on improving sim-to-real transfer by modeling more complex agent interactions and unforeseen events. Additionally, exploring broader applications of the GO-MPC framework, such as collaborative multi-robot tasks or integration with other AI systems, could yield further advances in autonomous navigation.
In summary, this paper provides a compelling approach to dynamic environment navigation by effectively integrating deep learning and model-based control. It not only demonstrates improved performance over existing methods but also sets a foundation for ongoing research and development in autonomous robotics.