Scaling Is All You Need: Autonomous Driving with JAX-Accelerated Reinforcement Learning (2312.15122v4)

Published 23 Dec 2023 in cs.LG, cs.AI, and cs.RO

Abstract: Reinforcement learning has been demonstrated to outperform even the best humans in complex domains like video games. However, running reinforcement learning experiments at the scale required for autonomous driving is extremely difficult. Building a large-scale reinforcement learning system and distributing it across many GPUs is challenging. Gathering experience on real-world vehicles during training is prohibitive from both a safety and a scalability perspective. Therefore, an efficient and realistic driving simulator is required that uses a large amount of data from real-world driving. We bring these capabilities together and conduct large-scale reinforcement learning experiments for autonomous driving. We demonstrate that our policy performance improves with increasing scale. Our best-performing policy reduces the failure rate by 64% while improving the rate of driving progress by 25% compared to the policies produced by state-of-the-art machine learning for autonomous driving.

