TC-Driver: Trajectory Conditioned Driving for Robust Autonomous Racing -- A Reinforcement Learning Approach (2205.09370v2)
Abstract: Autonomous racing is becoming popular among academic and industry researchers as a testbed for general autonomous driving, pushing perception, planning, and control algorithms to their limits. While traditional control methods such as MPC can generate an optimal control sequence at the edge of a vehicle's physical controllability, they are sensitive to the accuracy of the model parameters. This paper presents TC-Driver, an RL approach for robust control in autonomous racing. In particular, the TC-Driver agent is conditioned on a trajectory generated by an arbitrary traditional high-level planner. TC-Driver thus addresses tire-parameter modeling inaccuracies by exploiting the heuristic nature of RL while leveraging the reliability of traditional planning in a hierarchical control structure. We train the agent under varying tire conditions so that it generalizes to different model parameters, aiming to increase the real-world racing capability of the system. The proposed RL method outperforms a non-learning-based MPC with a 2.7-fold lower crash ratio in a model-mismatch setting, underlining its robustness to parameter discrepancies. In addition, the average RL inference time is 0.25 ms compared to an average MPC solving time of 11.5 ms, a nearly 40-fold speedup that allows complex control to be deployed on computationally constrained devices. Lastly, we show that the frequently used end-to-end RL architecture, in which a control policy is learned directly from sensory input, is well suited neither to model-mismatch robustness nor to track generalization. Our realistic simulations show that TC-Driver achieves 6.7-fold and 3-fold lower crash ratios under model mismatch and track generalization, respectively, while simultaneously achieving lower lap times than an end-to-end approach, demonstrating the viability of TC-Driver for robust autonomous racing.
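The hierarchical idea described in the abstract, a low-level RL policy conditioned on waypoints of a planner-generated trajectory and trained under randomized tire parameters, can be illustrated as a Gym-style environment. The sketch below is an assumption-laden illustration, not the paper's implementation: the point-mass dynamics, the waypoint horizon, the reward shaping, and names such as TrajectoryConditionedEnv and N_WAYPOINTS are all invented here for exposition.

```python
# Illustrative sketch (NOT the authors' code) of a trajectory-conditioned RL setup:
# the observation concatenates the car state with the next few waypoints of a
# precomputed racing line expressed in the car frame, and the tire friction
# coefficient is randomized per episode so the policy must generalize across
# model parameters.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

N_WAYPOINTS = 10   # horizon of the conditioning trajectory (assumed value)
DT = 0.02          # control period in seconds (assumed value)


class TrajectoryConditionedEnv(gym.Env):
    """Toy point-mass stand-in for the race car; illustrative only."""

    def __init__(self, racing_line):
        # (M, 2) array of x/y waypoints produced by any high-level planner
        self.racing_line = np.asarray(racing_line, dtype=np.float64)
        # observation: [vx, vy, yaw] followed by N_WAYPOINTS (x, y) pairs in the car frame
        self.observation_space = spaces.Box(-np.inf, np.inf,
                                            shape=(3 + 2 * N_WAYPOINTS,),
                                            dtype=np.float32)
        # action: [steering, throttle], both normalized to [-1, 1]
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # domain randomization: draw a new tire friction coefficient each episode
        self.mu = self.np_random.uniform(0.5, 1.1)
        self.pos = self.racing_line[0].copy()
        self.yaw = 0.0
        self.vel = np.zeros(2)
        self.idx = 0
        return self._obs(), {}

    def step(self, action):
        # placeholder dynamics; a real setup would use e.g. a dynamic bicycle model
        steer, throttle = np.clip(action, -1.0, 1.0)
        self.yaw += float(steer) * self.mu * 5.0 * DT
        speed = max(0.0, float(np.linalg.norm(self.vel)) + float(throttle) * self.mu * 10.0 * DT)
        self.vel = speed * np.array([np.cos(self.yaw), np.sin(self.yaw)])
        self.pos = self.pos + self.vel * DT
        self.idx = min(self.idx + 1, len(self.racing_line) - 1)
        # reward: progress minus deviation from the reference line (assumed shaping)
        deviation = float(np.linalg.norm(self.pos - self.racing_line[self.idx]))
        reward = speed * DT - 0.1 * deviation
        terminated = deviation > 2.0   # treat a large deviation as a crash
        return self._obs(), reward, terminated, False, {}

    def _obs(self):
        # express the upcoming waypoints in the car frame so the policy is track-agnostic
        ahead = self.racing_line[self.idx:self.idx + N_WAYPOINTS]
        ahead = np.pad(ahead, ((0, N_WAYPOINTS - len(ahead)), (0, 0)), mode="edge")
        c, s = np.cos(-self.yaw), np.sin(-self.yaw)
        rot = np.array([[c, -s], [s, c]])            # rotation by -yaw (world -> car frame)
        local = (ahead - self.pos) @ rot.T
        state = np.array([self.vel[0], self.vel[1], self.yaw])
        return np.concatenate([state, local.ravel()]).astype(np.float32)
```

Such an environment could then be trained with an off-policy algorithm such as SAC (for example via Stable-Baselines3). The design point the sketch is meant to convey is that the policy only ever sees local, track-relative information plus a randomized friction coefficient, which is what the abstract credits for track generalization and model-mismatch robustness.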