When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning (2304.12046v3)
Abstract: The hierarchy of global and local planners is one of the most commonly used system designs in autonomous robot navigation. The global planner generates a reference path from the robot's current location to the goal based on a pre-built map, while the local planner produces a kinodynamic trajectory that follows the reference path while avoiding perceived obstacles. To account for unforeseen or dynamic obstacles absent from the pre-built map, deciding "when to replan" the reference path is critical to safe and efficient navigation. However, determining the ideal replanning timing in such partially unknown environments remains an open question. In this work, we first conduct an extensive simulation experiment comparing several common replanning strategies, and confirm that the most effective strategy depends strongly on the environment as well as on the global and local planners. Based on this insight, we then derive a new adaptive replanning strategy based on deep reinforcement learning, which learns from experience to decide appropriate replanning timing for the given environment and planning setup. Our experimental results show that the proposed replanner performs on par with, or better than, the current best-performing strategies across multiple situations in terms of navigation robustness and efficiency.
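To make the setup concrete, the sketch below frames replanning timing as a sequential decision problem: at each control step an agent observes the local situation and chooses whether to re-run the global planner. This is a minimal illustration only; the observation, reward shaping, and the `sim` simulator interface (`reset`, `replan_global_path`, `step_local_planner`) are hypothetical placeholders, not the paper's actual formulation.

```python
# Minimal sketch of "when to replan" as a binary-action RL problem.
# Assumptions: a hypothetical navigation simulator `sim` exposing the
# global/local planner hierarchy; observation and reward design are illustrative.
import gymnasium as gym
import numpy as np
from gymnasium import spaces


class ReplanTimingEnv(gym.Env):
    """At each step, decide whether to replan the global reference path."""

    def __init__(self, sim):
        super().__init__()
        self.sim = sim  # hypothetical simulator with global/local planners
        # Example observation: flattened local costmap patch + goal-relative pose.
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(64 * 64 + 3,), dtype=np.float32
        )
        # Action 0: keep following the current reference path; 1: trigger replanning.
        self.action_space = spaces.Discrete(2)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        obs = self.sim.reset()
        return np.asarray(obs, dtype=np.float32), {}

    def step(self, action):
        if action == 1:
            self.sim.replan_global_path()  # re-run the global planner
        obs, reached_goal, collided = self.sim.step_local_planner()
        reward = -0.01                              # per-step time penalty
        reward += 1.0 if reached_goal else 0.0      # success bonus
        reward -= 1.0 if collided else 0.0          # collision penalty
        reward -= 0.05 if action == 1 else 0.0      # cost of triggering replanning
        terminated = reached_goal or collided
        return np.asarray(obs, dtype=np.float32), reward, terminated, False, {}
```

Under these assumptions, such a decision policy could be trained with an off-the-shelf value-based method, e.g. `DQN("MlpPolicy", ReplanTimingEnv(sim)).learn(total_timesteps=200_000)` from Stable-Baselines3; the small per-replan cost discourages degenerate policies that replan at every step.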