SceneDM: Scene-level Multi-agent Trajectory Generation with Consistent Diffusion Models (2311.15736v1)
Abstract: Realistic scene-level multi-agent motion simulation is crucial for developing and evaluating self-driving algorithms. However, most existing works generate trajectories for a single agent type and typically ignore the consistency of the generated trajectories. In this paper, we propose SceneDM, a novel framework based on diffusion models that generates joint and consistent future motions for all agents in a scene, including vehicles, bicycles, and pedestrians. To enhance the consistency of the generated trajectories, we introduce a new Transformer-based network that handles agent-agent interactions in the reverse (denoising) process of motion diffusion. Because agent trajectories should be smooth, we further design a simple yet effective consistent diffusion approach that helps the model exploit short-term temporal dependencies. In addition, a scene-level scoring function evaluates the safety and road adherence of the generated agent motions and helps filter out unrealistic simulations. SceneDM achieves state-of-the-art results on the Waymo Sim Agents Benchmark. The project webpage is available at https://alperen-hub.github.io/SceneDM.
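The abstract does not specify how the scene-level scoring function is computed. As a rough illustration of the idea only, the sketch below scores each sampled scene by counting safety violations (agents coming too close) and road-adherence violations (positions far from a lane centerline), then keeps the lowest-scoring scenes. Every name, threshold, and data layout here (`collision_count`, `offroad_count`, `scene_score`, `filter_scenes`, `radius`, `max_offset`, scenes as lists of per-agent `(x, y)` trajectories) is an assumption made for illustration, not the paper's formulation.

```python
import math
from itertools import combinations

# Hypothetical layout: a scene is a list of agent trajectories,
# each trajectory a list of (x, y) positions, one per timestep.

def collision_count(scene, radius=1.0):
    """Count (agent pair, timestep) events where two agents are closer than 2 * radius."""
    count = 0
    for traj_a, traj_b in combinations(scene, 2):
        for (xa, ya), (xb, yb) in zip(traj_a, traj_b):
            if math.hypot(xa - xb, ya - yb) < 2 * radius:
                count += 1
    return count

def offroad_count(scene, centerline, max_offset=2.0):
    """Count agent positions farther than max_offset from every centerline point."""
    count = 0
    for traj in scene:
        for x, y in traj:
            nearest = min(math.hypot(x - cx, y - cy) for cx, cy in centerline)
            if nearest > max_offset:
                count += 1
    return count

def scene_score(scene, centerline):
    """Lower is better: total number of safety and road-adherence violations."""
    return collision_count(scene) + offroad_count(scene, centerline)

def filter_scenes(scenes, centerline, keep=1):
    """Return the `keep` sampled scenes with the fewest violations."""
    return sorted(scenes, key=lambda s: scene_score(s, centerline))[:keep]

# Toy example: a straight lane along y = 0 and three sampled two-agent scenes.
centerline = [(float(x), 0.0) for x in range(11)]
good = [[(0, 0), (1, 0), (2, 0)], [(6, 0), (7, 0), (8, 0)]]
colliding = [[(0, 0), (1, 0), (2, 0)], [(4, 0), (3, 0), (2.5, 0)]]
offroad = [[(0, 0), (1, 0), (2, 0)], [(6, 0), (7, 3), (8, 5)]]

best = filter_scenes([colliding, offroad, good], centerline)
```

In this toy setup the scene with neither a near-collision nor an off-lane position is the one retained. A realistic scorer would likely use agent bounding boxes and a full map representation rather than point distances.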