
DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving (2404.12624v2)

Published 19 Apr 2024 in cs.RO and cs.CV

Abstract: Evaluating and training autonomous driving systems require diverse and scalable corner cases. However, most existing scene generation methods lack controllability, accuracy, and versatility, resulting in unsatisfactory generation results. Inspired by DragGAN in image generation, we propose DragTraffic, a generalized, interactive, and controllable traffic scene generation framework based on conditional diffusion. DragTraffic enables non-experts to generate a variety of realistic driving scenarios for different types of traffic agents through an adaptive mixture-of-experts architecture. We employ a regression model to provide a general initial solution and a refinement process based on the conditional diffusion model to ensure diversity. User-customized context is introduced through cross-attention to ensure high controllability. Experiments on a real-world driving dataset show that DragTraffic outperforms existing methods in terms of authenticity, diversity, and freedom. Demo videos and code are available at https://chantsss.github.io/Dragtraffic/.
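The abstract describes a two-stage pipeline: a regression model proposes a coarse initial solution, and a conditional diffusion model refines it while conditioning on user-specified context (e.g., a dragged goal point). The toy sketch below illustrates that control flow only; the function names, the linear-interpolation initializer, and the noise-then-smooth refinement loop are all illustrative stand-ins, not the paper's actual models.

```python
import numpy as np

def regression_init(start, goal, steps=10):
    """Stage 1 (stand-in): coarse initial trajectory from the agent's
    start position toward the user-dragged goal, via linear interpolation."""
    t = np.linspace(0.0, 1.0, steps)[:, None]        # (steps, 1)
    return (1.0 - t) * start + t * goal              # broadcasts to (steps, 2)

def refine(traj, start, goal, n_iters=20, noise=0.05, seed=0):
    """Stage 2 (stand-in): inject noise for diversity, then 'denoise' by
    smoothing and re-pinning the endpoints to the user-specified context,
    mimicking the role of the conditional diffusion refinement."""
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        traj = traj + rng.normal(0.0, noise, traj.shape)        # diversify
        traj[1:-1] = 0.5 * traj[1:-1] + 0.25 * (traj[:-2] + traj[2:])  # smooth
        traj[0], traj[-1] = start, goal                         # honor user context
    return traj

start, goal = np.array([0.0, 0.0]), np.array([10.0, 2.0])
init = regression_init(start, goal)          # coarse proposal, shape (10, 2)
final = refine(init.copy(), start, goal)     # diverse but goal-consistent result
```

Different random seeds yield different intermediate waypoints while the start and goal constraints are always satisfied, which is the controllability-versus-diversity trade-off the framework targets.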

References (30)
  1. P. A. Lopez, M. Behrisch, L. Bieker-Walz, J. Erdmann, Y.-P. Flötteröd, R. Hilbrich, L. Lücken, J. Rummel, P. Wagner, and E. Wiessner, “Microscopic traffic simulation using SUMO,” in 2018 21st International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 2575–2582.
  2. A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning, 2017, pp. 1–16.
  3. D. Chen, M. Zhu, H. Yang, X. Wang, and Y. Wang, “Data-driven traffic simulation: A comprehensive review,” 2023.
  4. N. Montali, J. Lambert, P. Mougin, A. Kuefler, N. Rhinehart, M. Li, C. Gulino, T. Emrich, Z. Yang, S. Whiteson, et al., “The waymo open sim agents challenge,” Advances in Neural Information Processing Systems, vol. 36, 2024.
  5. J. Philion, X. B. Peng, and S. Fidler, “Trajeglish: Learning the language of driving scenarios,” arXiv preprint arXiv:2312.04535, 2023.
  6. Z. Zhou, Z. Wen, J. Wang, Y.-H. Li, and Y.-K. Huang, “Qcnext: A next-generation framework for joint multi-agent trajectory prediction,” arXiv preprint arXiv:2306.10508, 2023.
  7. S. Shi, L. Jiang, D. Dai, and B. Schiele, “Mtr++: Multi-agent motion prediction with symmetric scene modeling and guided intention querying,” 2023.
  8. S. Tan, K. Wong, S. Wang, S. Manivasagam, M. Ren, and R. Urtasun, “Scenegen: Learning to generate realistic traffic scenes,” 2021.
  9. L. Feng, Q. Li, Z. Peng, S. Tan, and B. Zhou, “Trafficgen: Learning to generate diverse and realistic traffic scenarios,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 3567–3575.
  10. H. Zhang, H. Song, S. Li, M. Zhou, and D. Song, “A survey of controllable text generation using transformer-based pre-trained language models,” ACM Computing Surveys, vol. 56, no. 3, pp. 1–37, 2023.
  11. R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” 2022.
  12. A. Ramesh et al., “Hierarchical text-conditional image generation with clip latents,” 2022.
  13. C. Meng, Y. He, Y. Song, J. Song, J. Wu, J.-Y. Zhu, and S. Ermon, “Sdedit: Guided image synthesis and editing with stochastic differential equations,” 2022.
  14. W. Peebles and S. Xie, “Scalable diffusion models with transformers,” 2023.
  15. W. Ding, B. Chen, B. Li, K. J. Eun, and D. Zhao, “Multimodal safety-critical scenarios generation for decision-making algorithms evaluation,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1551–1558, 2021.
  16. Z.-H. Yin, L. Sun, L. Sun, M. Tomizuka, and W. Zhan, “Diverse critical interaction generation for planning and planner evaluation,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 7036–7043.
  17. C. Jiang, A. Cornman, C. Park, B. Sapp, Y. Zhou, D. Anguelov, et al., “Motiondiffuser: Controllable multi-agent motion prediction using diffusion,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 9644–9653.
  18. Z. Guo, X. Gao, J. Zhou, X. Cai, and B. Shi, “Scenedm: Scene-level multi-agent trajectory generation with consistent diffusion models,” arXiv preprint arXiv:2311.15736, 2023.
  19. Z. Zhong, D. Rempe, Y. Chen, B. Ivanovic, Y. Cao, D. Xu, M. Pavone, and B. Ray, “Language-guided traffic simulation via scene-level diffusion,” in Conference on Robot Learning. PMLR, 2023, pp. 144–177.
  20. X. Pan, A. Tewari, T. Leimkühler, L. Liu, A. Meka, and C. Theobalt, “Drag your gan: Interactive point-based manipulation on the generative image manifold,” 2023.
  21. B. Varadarajan, A. Hefny, A. Srivastava, K. S. Refaat, N. Nayakanti, A. Cornman, K. Chen, B. Douillard, C. Lam, D. Anguelov, and B. Sapp, “Multipath++: Efficient information fusion and trajectory aggregation for behavior prediction,” CoRR, vol. abs/2111.14973, 2021. [Online]. Available: https://arxiv.org/abs/2111.14973
  22. J. Gu, C. Sun, and H. Zhao, “Densetnt: End-to-end trajectory prediction from dense goal sets,” CoRR, vol. abs/2108.09640, 2021. [Online]. Available: https://arxiv.org/abs/2108.09640
  23. N. Nayakanti, R. Al-Rfou, A. Zhou, K. Goel, K. S. Refaat, and B. Sapp, “Wayformer: Motion forecasting via simple efficient attention networks,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 2980–2987.
  24. T. Gilles, S. Sabatini, D. Tsishkou, B. Stanciulescu, and F. Moutarde, “GOHOME: graph-oriented heatmap output for future motion estimation,” CoRR, vol. abs/2109.01827, 2021. [Online]. Available: https://arxiv.org/abs/2109.01827
  25. K. Mangalam, Y. An, H. Girase, and J. Malik, “From goals, waypoints & paths to long term human trajectory forecasting,” CoRR, vol. abs/2012.01526, 2020. [Online]. Available: https://arxiv.org/abs/2012.01526
  26. W. Ding, W. Wang, and D. Zhao, “Multi-vehicle trajectories generation for vehicle-to-vehicle encounters,” in 2019 IEEE International Conference on Robotics and Automation (ICRA), 2019.
  27. G. Oh and H. Peng, “Cvae-h: Conditionalizing variational autoencoders via hypernetworks and trajectory forecasting for autonomous driving,” arXiv preprint arXiv:2201.09874, 2022.
  28. J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” 2020.
  29. W. Mao, C. Xu, Q. Zhu, S. Chen, and Y. Wang, “Leapfrog diffusion model for stochastic trajectory prediction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5517–5526.
  30. Waymo LLC, “Waymo open dataset: An autonomous driving dataset,” 2019.
