Towards Consistent and Explainable Motion Prediction using Heterogeneous Graph Attention (2405.10134v1)

Published 16 May 2024 in cs.RO and cs.AI

Abstract: In autonomous driving, accurately interpreting the movements of other road users and leveraging this knowledge to forecast future trajectories is crucial. This is typically achieved through the integration of map data and tracked trajectories of various agents. Numerous methodologies combine this information into a singular embedding for each agent, which is then utilized to predict future behavior. However, these approaches have a notable drawback in that they may lose exact location information during the encoding process. Although the encoding still includes general map information, the generation of valid and consistent trajectories is not guaranteed, which can cause the predicted trajectories to stray from the actual lanes. This paper introduces a new refinement module designed to project the predicted trajectories back onto the actual map, rectifying these discrepancies and leading to more consistent predictions. This versatile module can be readily incorporated into a wide range of architectures. Additionally, we propose a novel scene encoder that handles all relations between agents and their environment in a single unified heterogeneous graph attention network. By analyzing the attention values on the different edges in this graph, we can gain unique insights into the neural network's inner workings, leading to a more explainable prediction.
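The abstract's central idea of projecting predicted waypoints back onto the map can be illustrated with a minimal sketch. The snippet below is not the paper's learned refinement module; it is a hypothetical nearest-neighbour snap of each predicted waypoint onto densely sampled lane-centerline points (the function name `project_to_lanes` and the toy lane data are assumptions made purely for illustration), intended only to show why such a projection keeps trajectories on valid lanes.

```python
import numpy as np

def project_to_lanes(pred_traj, lane_points):
    """Snap each predicted waypoint to its nearest lane-centerline point.

    pred_traj:   (T, 2) array of predicted (x, y) waypoints.
    lane_points: (N, 2) array of densely sampled lane-centerline points.
    Returns a (T, 2) trajectory whose waypoints lie on the sampled lanes.
    """
    # Pairwise distances between every waypoint and every lane point: (T, N)
    dists = np.linalg.norm(pred_traj[:, None, :] - lane_points[None, :, :], axis=-1)
    nearest = np.argmin(dists, axis=1)   # index of the closest lane point per waypoint
    return lane_points[nearest]          # trajectory pulled back onto the lane

# Hypothetical example: a straight lane and a prediction that drifts off-lane.
lane = np.stack([np.linspace(0.0, 50.0, 200), np.zeros(200)], axis=1)
pred = np.stack([np.linspace(0.0, 50.0, 10), np.linspace(0.0, 1.5, 10)], axis=1)
refined = project_to_lanes(pred, lane)
print(refined[-1])  # last waypoint snapped back to the lane, e.g. [50. 0.]
```

The paper's actual module is learned and architecture-agnostic, and its scene encoder models agent-map and agent-agent relations in one heterogeneous graph attention network; the sketch above only conveys the geometric intuition behind the map-projection step.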

Authors (5)
  1. Tobias Demmler (3 papers)
  2. Andreas Tamke (2 papers)
  3. Thao Dang (15 papers)
  4. Karsten Haug (5 papers)
  5. Lars Mikelsons (17 papers)
Citations (1)
