Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging (2409.05289v3)
Abstract: In autonomous driving, end-to-end methods using Imitation Learning (IL) and Reinforcement Learning (RL) are becoming increasingly common. However, unlike the classic robotics workflow, they do not involve explicit reasoning or planning over horizons, which results in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values that adjust the given reference waypoints, producing a modified path that can be executed by different controllers. Experimental results show that the algorithm can perform path following that mimics the expert performance of path-tracking controllers, and can avoid collisions with static obstacles. The method is a promising step toward applying learning-based methods to path planning problems in autonomous driving.
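The core mechanism the abstract describes, shifting reference waypoints sideways by learned lateral offsets, can be illustrated with a short sketch. This is a minimal illustration under assumptions made here for clarity, not the authors' implementation: the function name `apply_lateral_offsets`, the sign convention (positive offset = shift toward the left of the path), and the example offset values standing in for a trained policy's output are all hypothetical.

```python
import numpy as np

def apply_lateral_offsets(waypoints: np.ndarray, offsets: np.ndarray) -> np.ndarray:
    """Shift each (x, y) waypoint along its left normal by a signed offset.

    waypoints: (N, 2) array of reference path points.
    offsets:   (N,) array of signed lateral offsets (positive = left).
    """
    # Approximate the tangent at each waypoint via finite differences.
    tangents = np.gradient(waypoints, axis=0)
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    # Left normal: rotate each unit tangent by +90 degrees.
    normals = np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)
    # Modified path: offset each point along its normal.
    return waypoints + offsets[:, None] * normals

# Example: nudge the middle of a straight reference path to the left,
# as the PPO policy might do to clear a static obstacle.
ref = np.stack([np.linspace(0.0, 10.0, 11), np.zeros(11)], axis=1)
off = np.zeros(11)
off[4:7] = 0.5  # hypothetical offsets from the learned policy
modified = apply_lateral_offsets(ref, off)
print(modified)
```

In this framing, BC would be trained to output near-zero offsets that reproduce an expert path-tracking controller, while PPO would learn nonzero offsets around obstacles; the downstream controller simply tracks the modified waypoints.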