
SceneDM: Scene-level Multi-agent Trajectory Generation with Consistent Diffusion Models (2311.15736v1)

Published 27 Nov 2023 in cs.RO and cs.AI

Abstract: Realistic scene-level multi-agent motion simulations are crucial for developing and evaluating self-driving algorithms. However, most existing works focus on generating trajectories for a certain single agent type, and typically ignore the consistency of generated trajectories. In this paper, we propose a novel framework based on diffusion models, called SceneDM, to generate joint and consistent future motions of all the agents, including vehicles, bicycles, pedestrians, etc., in a scene. To enhance the consistency of the generated trajectories, we resort to a new Transformer-based network to effectively handle agent-agent interactions in the inverse process of motion diffusion. In consideration of the smoothness of agent trajectories, we further design a simple yet effective consistent diffusion approach, to improve the model in exploiting short-term temporal dependencies. Furthermore, a scene-level scoring function is attached to evaluate the safety and road-adherence of the generated agent's motions and help filter out unrealistic simulations. Finally, SceneDM achieves state-of-the-art results on the Waymo Sim Agents Benchmark. Project webpage is available at https://alperen-hub.github.io/SceneDM.


Summary

  • The paper introduces a Transformer-based diffusion framework for generating joint, scene-consistent trajectories for all agents in a traffic scene, realistically simulating complex interactions.
  • It employs a consistent diffusion scheme to keep generated motions locally smooth, while explicit agent-agent interaction modeling helps agents avoid collisions and obey traffic rules.
  • A scene-level scoring function checks safety and road adherence, filtering out unrealistic simulations; the method achieves state-of-the-art results on the Waymo Sim Agents Benchmark.

Introduction to Scene-level Multi-agent Trajectory Generation

Creating realistic virtual traffic environments requires accurately simulating the interactions and movements of heterogeneous agents such as vehicles, bicycles, and pedestrians. This is essential for developing and validating autonomous vehicle technology. Traditional methods often focus on a single agent type and cannot realistically model the varied behaviors present in complex scenes.

A Novel Approach with SceneDM

The method presented in this paper, SceneDM, centers on a Transformer-based denoising network that generates joint future trajectories for all agents in a scene simultaneously. Because the agents are denoised together, their trajectories remain mutually consistent: they avoid unrealistic outcomes such as collisions and adhere to traffic constraints like staying within road boundaries.
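The core idea of joint denoising can be sketched as attention across agents: each agent's noisy trajectory becomes a token, and attention mixes information between tokens so every agent's denoised motion accounts for the others. This is a minimal illustrative sketch, not SceneDM's actual architecture; the projections here are identity maps where a real model would use learned weights.

```python
import numpy as np

def denoise_step(noisy_traj):
    """One hypothetical denoising step with attention across agents.

    noisy_traj: (num_agents, horizon, 2) noisy xy trajectories.
    Each agent is flattened into a single token; attention over
    tokens models agent-agent interaction during denoising.
    """
    n, h, d = noisy_traj.shape
    tokens = noisy_traj.reshape(n, h * d)          # one token per agent
    # identity projections for illustration (a real model learns these)
    q = k = v = tokens
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (n, n) agent affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    mixed = weights @ v                             # blend agents' information
    return mixed.reshape(n, h, d)
```

For identical inputs the softmax weights are uniform, so the output equals the input; with distinct agents, each output trajectory becomes a similarity-weighted blend of all agents, which is the interaction mechanism a learned Transformer would refine.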

Enhancements via Consistent Diffusion

SceneDM introduces what is called consistent diffusion to enhance local smoothness, ensuring that the motion paths it generates closely resemble the natural movement patterns of agents. The process involves partially overlapping the noise added during sequence generation to preserve the continuity of motion from one state to the next. In doing this, SceneDM effectively captures the temporal dependencies and interaction dynamics between agents.
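One way to picture the overlapping-noise idea is to correlate the noise injected at adjacent timesteps, so that consecutive frames are perturbed similarly and the denoised trajectory inherits short-term smoothness. The sketch below uses a simple autoregressive mixing scheme as a stand-in; the `overlap` parameter and the AR(1) form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def correlated_noise(horizon, dim, overlap=0.8, rng=None):
    """Unit-variance noise whose adjacent timesteps share a component.

    overlap in [0, 1): 0 yields i.i.d. noise; larger values make
    consecutive frames' noise more similar, encouraging locally
    smooth denoised trajectories.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((horizon, dim))
    out = np.empty_like(eps)
    out[0] = eps[0]
    for t in range(1, horizon):
        # mix the previous frame's noise with fresh noise; the
        # coefficients preserve unit variance at every step
        out[t] = overlap * out[t - 1] + np.sqrt(1 - overlap**2) * eps[t]
    return out
```

With `overlap=0.8`, the lag-1 autocorrelation of the noise is about 0.8, so a denoiser trained against such noise sees perturbations that vary gradually in time rather than jumping frame to frame.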

Evaluating Generated Scenes

To keep simulations realistic, SceneDM incorporates a scene-level scoring function that evaluates the safety and road adherence of each generated motion, filtering out rule-violating scenarios. Results on the Waymo Sim Agents Benchmark underscore SceneDM's leading performance, particularly in producing smooth trajectories that reflect the complex, interactive nature of real-world traffic.
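A toy version of such a scoring function can combine a collision check with a drivable-area check and report the fraction of frames that pass both. The circular drivable region and the distance thresholds below are placeholders for the paper's map-based road-adherence and safety criteria.

```python
import numpy as np

def scene_score(trajs, drivable_radius=50.0, min_gap=2.0):
    """Toy scene-level score in [0, 1]: fraction of agent-frames that
    are both collision-free and inside a drivable region.

    trajs: (num_agents, horizon, 2) xy positions.
    """
    n, h, _ = trajs.shape
    # road adherence: stay within a circular drivable region
    on_road = np.linalg.norm(trajs, axis=-1) <= drivable_radius   # (n, h)
    # safety: pairwise agent distances at every frame
    diff = trajs[:, None] - trajs[None, :]                        # (n, n, h, 2)
    dist = np.linalg.norm(diff, axis=-1)                          # (n, n, h)
    dist[np.arange(n), np.arange(n)] = np.inf                     # ignore self
    no_collision = (dist > min_gap).all(axis=1)                   # (n, h)
    return float((on_road & no_collision).mean())
```

Generated scenes scoring below a threshold would be discarded, leaving only simulations that are safe and stay on the road.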

Conclusion

Overall, SceneDM represents a significant step forward in traffic simulation technology. It not only generates trajectories for multiple types of agents but also emphasizes the importance of keeping these trajectories locally smooth and consistent with each other. Such advancements promise to enhance the testing of autonomous driving systems, making simulations closer to real-world complexities and thus more reliable for safety validation purposes.
