CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning (2403.19918v3)

Published 29 Mar 2024 in cs.RO, cs.AI, and cs.LG

Abstract: Evaluating autonomous vehicle (AV) stacks in simulation typically involves replaying driving logs recorded from real-world traffic. However, agents replayed from offline data are not reactive and are hard to control intuitively. Existing approaches address these challenges with methods that rely on heuristics or on generative models of real-world data, but these approaches either lack realism or require costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning (RL) to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through a physics-enhanced Nocturne simulator to generate a diverse offline RL dataset annotated with various rewards. With this dataset, we train a return-conditioned multi-agent behaviour model that allows fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including adversarial behaviours. We show that CtRL-Sim can generate realistic safety-critical scenarios while providing fine-grained control over agent behaviours.
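The control mechanism described in the abstract, conditioning a trained behaviour model on desired returns for individual reward components, can be illustrated with a minimal Python sketch. This is a hypothetical illustration, not the authors' implementation: the names (BehaviourModel, rollout), the reward components ("goal", "collision", "off_road"), and the return ranges are assumptions made for the example.

# Hypothetical sketch of return-conditioned action sampling in the style of a
# return-conditioned offline RL behaviour model. All names are illustrative.
import numpy as np


class BehaviourModel:
    """Stand-in for a trained return-conditioned multi-agent policy."""

    def sample_action(self, scene_tokens, returns_to_go):
        # A real model would condition an autoregressive transformer on the
        # scene context and the desired per-component returns-to-go; here we
        # just return a dummy (acceleration, steering) pair.
        rng = np.random.default_rng(0)
        return rng.normal(size=2)


def rollout(model, scene_tokens, desired_returns, horizon=80):
    """Roll out one agent, steering its behaviour via per-component return targets.

    desired_returns maps reward components (e.g. "goal", "collision",
    "off_road") to target returns; raising the "collision" target encourages
    safe driving, while lowering it elicits adversarial behaviour.
    """
    returns_to_go = dict(desired_returns)
    trajectory = []
    for _ in range(horizon):
        action = model.sample_action(scene_tokens, returns_to_go)
        trajectory.append(action)
        # A full implementation would step the simulator physics, compute the
        # per-component rewards, and decrement the returns-to-go accordingly.
    return trajectory


if __name__ == "__main__":
    model = BehaviourModel()
    # Nominal driving: high targets for all reward components.
    safe = rollout(model, scene_tokens=None,
                   desired_returns={"goal": 1.0, "collision": 1.0, "off_road": 1.0})
    # Adversarial variant: drive the collision-avoidance return target down.
    adversarial = rollout(model, scene_tokens=None,
                          desired_returns={"goal": 1.0, "collision": -1.0, "off_road": 1.0})
    print(len(safe), len(adversarial))

The point of the sketch is the design choice the abstract highlights: the same trained model can produce nominal or adversarial traffic simply by changing the return targets at sampling time, without retraining or iterative guidance.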

Authors (8)
  1. Luke Rowe (8 papers)
  2. Roger Girgis (8 papers)
  3. Anthony Gosselin (5 papers)
  4. Bruno Carrez (2 papers)
  5. Florian Golemo (21 papers)
  6. Felix Heide (72 papers)
  7. Liam Paull (47 papers)
  8. Christopher Pal (97 papers)
Citations (3)
