Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2) (2402.16720v2)

Published 26 Feb 2024 in cs.RO

Abstract: Real-world autonomous driving (AD), especially urban driving, involves many corner cases. The recently released AD simulator CARLA v2 adds 39 common events to the driving scene, providing a more quasi-realistic testbed than CARLA v1. It poses a new challenge to the community, and so far no work has reported success on the new scenarios in v2, as existing methods mostly rely on specific planning rules that cannot cover the more complex cases in CARLA v2. In this work, we take the initiative of directly training a planner, with the hope of handling corner cases flexibly and effectively, which we believe is also the future of AD. To the best of our knowledge, we develop the first model-based RL method for AD, named Think2Drive, which uses a world model to learn the transitions of the environment and then employs it as a neural simulator to train the planner. This paradigm significantly boosts training efficiency thanks to the low-dimensional state space and the parallel tensor computation inside the world model. As a result, Think2Drive reaches expert-level proficiency in CARLA v2 within 3 days of training on a single A6000 GPU; to the best of our knowledge, no prior success (100% route completion) on CARLA v2 has been reported. We also propose CornerCase-Repository, a benchmark that supports scenario-wise evaluation of driving models. Additionally, we propose a new, balanced metric that evaluates performance by route completion, infraction count, and scenario density, so that the driving score conveys more information about actual driving performance.
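The core paradigm the abstract describes — fit a world model to real transitions, then train the planner entirely inside that learned model ("thinking" in imagination) — can be sketched in a toy form. The linear dynamics, 1-D state, and grid-searched linear policy below are illustrative assumptions for clarity only, not the paper's DreamerV3-style latent architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "environment" (stand-in for CARLA): a 1-D point that moves by the
# chosen action plus noise; reward is highest when the point is near 0.
def env_step(s, a):
    s_next = s + a + 0.01 * rng.standard_normal()
    return s_next, -s_next ** 2

# 1) Collect real transitions with a random exploration policy.
data, s = [], 1.0
for _ in range(500):
    a = rng.uniform(-0.5, 0.5)
    s_next, _ = env_step(s, a)
    data.append((s, a, s_next))
    s = s_next

# 2) Fit a linear world model s' ~ w1*s + w2*a by least squares.
X = np.array([[s, a] for s, a, _ in data])
y = np.array([s_next for _, _, s_next in data])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# 3) Optimize a linear policy a = -k*s purely inside the learned model:
#    roll out imagined trajectories (no environment queries) and pick
#    the gain k with the highest imagined return.
def imagined_return(k, horizon=20):
    s_im, total = 1.0, 0.0
    for _ in range(horizon):
        s_im = w[0] * s_im + w[1] * (-k * s_im)  # model rollout, not env
        total += -s_im ** 2
    return total

best_k = max(np.linspace(0.0, 1.5, 31), key=imagined_return)
```

Because the true dynamics are `s' = s + a + noise`, the fitted model recovers `w ≈ [1, 1]`, and imagined rollouts correctly identify `k ≈ 1` (drive the state straight to the goal). The efficiency argument in the abstract corresponds to step 3: once the world model is learned, policy training needs only cheap, parallelizable model rollouts rather than expensive simulator steps.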

