
MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying (2306.17770v2)

Published 30 Jun 2023 in cs.CV

Abstract: Motion prediction is crucial for autonomous driving systems to understand complex driving scenarios and make informed decisions. However, this task is challenging due to the diverse behaviors of traffic participants and complex environmental contexts. In this paper, we propose Motion TRansformer (MTR) frameworks to address these challenges. The initial MTR framework utilizes a transformer encoder-decoder structure with learnable intention queries, enabling efficient and accurate prediction of future trajectories. By customizing intention queries for distinct motion modalities, MTR improves multimodal motion prediction while reducing reliance on dense goal candidates. The framework comprises two essential processes: global intention localization, identifying the agent's intent to enhance overall efficiency, and local movement refinement, adaptively refining predicted trajectories for improved accuracy. Moreover, we introduce an advanced MTR++ framework, extending the capability of MTR to simultaneously predict multimodal motion for multiple agents. MTR++ incorporates symmetric context modeling and mutually-guided intention querying modules to facilitate future behavior interaction among multiple agents, resulting in scene-compliant future trajectories. Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on the highly-competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.
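
The abstract describes decoding multimodal futures from a fixed set of intention queries that attend to encoded scene context. The following is only a rough sketch of that general idea in plain Python, not the authors' implementation. The names `cross_attend` and `decode_modes`, the random (untrained) queries and context tokens, the mean-feature confidence score, and the straight-line trajectory head are all assumptions made for illustration. In the actual framework the queries are learned and the decoder is a stacked transformer.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product cross-attention: each query attends over all context tokens."""
    dim = len(context[0])
    attended = []
    for query in queries:
        scores = [
            sum(q * c for q, c in zip(query, token)) / math.sqrt(dim)
            for token in context
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * token[d] for w, token in zip(weights, context)) for d in range(dim)]
        )
    return attended


def decode_modes(queries, context, horizon=4):
    """Turn one feature per intention query into a toy trajectory and a confidence."""
    features = cross_attend(queries, context)
    # Hypothetical scoring head: softmax over the mean of each mode feature.
    confidences = softmax([sum(f) / len(f) for f in features])
    trajectories = []
    for f in features:
        # Hypothetical regression head: treat the first two feature dimensions
        # as an endpoint and interpolate linearly from the origin.
        end_x, end_y = f[0], f[1]
        trajectories.append(
            [(end_x * t / horizon, end_y * t / horizon) for t in range(1, horizon + 1)]
        )
    return trajectories, confidences


# Random stand-ins for learned intention queries and encoded scene tokens.
rng = random.Random(0)
DIM, NUM_MODES, NUM_TOKENS = 8, 6, 10
intention_queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_MODES)]
scene_context = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_TOKENS)]

trajectories, confidences = decode_modes(intention_queries, scene_context)
for mode, (traj, conf) in enumerate(zip(trajectories, confidences)):
    print(f"mode {mode}: confidence={conf:.3f}, endpoint=({traj[-1][0]:.2f}, {traj[-1][1]:.2f})")
```

Each query yields exactly one candidate future, so the number of predicted modes is set by the number of queries rather than by a dense grid of goal candidates, which is the efficiency argument the abstract makes.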
