ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction (2404.10295v2)

Published 16 Apr 2024 in cs.RO

Abstract: The ability to accurately predict feasible multimodal future trajectories of surrounding traffic participants is crucial for behavior planning in autonomous vehicles. The Motion Transformer (MTR), a state-of-the-art motion prediction method, alleviated mode collapse and training instability and enhanced overall prediction performance by replacing conventional dense future endpoints with a small set of fixed prior motion intention points. However, these fixed prior intention points make MTR's multi-modal prediction distribution over-scattered and infeasible in many scenarios. In this paper, we propose the ControlMTR framework to tackle these issues by generating scene-compliant intention points and additionally predicting driving control commands, which are converted into trajectories by a simple kinematic model with soft constraints. These control-generated trajectories guide the directly predicted trajectories through an auxiliary loss function. Together with the proposed scene-compliant intention points, they effectively restrict the prediction distribution to within the road boundaries and suppress infeasible off-road predictions while enhancing prediction performance. Without resorting to additional model ensemble techniques, our method surpasses the baseline MTR model across all performance metrics, achieving a 5.22% improvement in SoftmAP and a 4.15% reduction in MissRate. It also reduces MTR's cross-boundary rate by 41.85%, effectively confining the prediction distribution to the drivable area.
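The abstract's core mechanism, converting predicted control commands into kinematically feasible trajectories and pulling the directly predicted trajectories toward them with an auxiliary loss, can be illustrated with a short sketch. The code below is a minimal, hypothetical illustration only: it assumes a simple unicycle-style kinematic model driven by acceleration and yaw-rate commands, uses tanh squashing as one possible form of "soft constraints," and uses a plain L2 guidance loss. The paper's actual kinematic model, constraint formulation, and loss weighting are not reproduced here.

```python
# Hypothetical sketch of ControlMTR's control-to-trajectory idea.
# The kinematic model, the tanh soft constraints, and the L2 guidance loss
# are assumptions for illustration, not the paper's exact implementation.
import numpy as np

def rollout_kinematic(x0, y0, heading0, speed0, accels, yaw_rates,
                      dt=0.1, a_max=8.0, yaw_rate_max=1.0):
    """Unroll (acceleration, yaw-rate) commands into an (x, y) trajectory.

    Commands are squashed into plausible ranges with tanh, one possible
    way to realize "soft constraints" on the controls.
    """
    x, y, heading, speed = x0, y0, heading0, speed0
    traj = []
    for a, w in zip(accels, yaw_rates):
        a = a_max * np.tanh(a / a_max)                 # soft-limit acceleration
        w = yaw_rate_max * np.tanh(w / yaw_rate_max)   # soft-limit yaw rate
        speed = max(speed + a * dt, 0.0)               # no reversing in this sketch
        heading = heading + w * dt
        x = x + speed * np.cos(heading) * dt
        y = y + speed * np.sin(heading) * dt
        traj.append((x, y))
    return np.asarray(traj)                            # shape (T, 2)

def control_guidance_loss(direct_traj, control_traj):
    """Auxiliary loss: mean squared distance between the directly predicted
    trajectory and the kinematically feasible, control-generated one."""
    return float(np.mean(np.sum((direct_traj - control_traj) ** 2, axis=-1)))

# Usage: an 8 s horizon at 10 Hz with mild acceleration and a gentle turn.
T = 80
accels = np.full(T, 0.5)
yaw_rates = np.full(T, 0.05)
feasible = rollout_kinematic(0.0, 0.0, 0.0, 5.0, accels, yaw_rates)
direct = feasible + np.random.normal(scale=0.2, size=feasible.shape)  # stand-in for network output
print(control_guidance_loss(direct, feasible))
```

In the framework described by the abstract, a loss of this kind would be added as an auxiliary term alongside the usual trajectory regression and intention objectives; how it is weighted is a design choice the abstract does not specify.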

References (22)
  1. S. Shi, L. Jiang, D. Dai, and B. Schiele, “Motion Transformer with Global Intention Localization and Local Movement Refinement,” Mar. 2023, arXiv:2209.13508 [cs].
  2. B. Varadarajan, A. Hefny, A. Srivastava, K. S. Refaat, N. Nayakanti, A. Cornman, K. Chen, B. Douillard, C. P. Lam, D. Anguelov, and B. Sapp, “MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction,” Dec. 2021, arXiv:2111.14973 [cs].
  3. Z. Huang, H. Liu, J. Wu, and C. Lv, “Differentiable integrated motion prediction and planning with learnable cost function for autonomous driving,” IEEE transactions on neural networks and learning systems, 2023.
  4. J. Gu, C. Sun, and H. Zhao, “DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets,” Nov. 2021, arXiv:2108.09640 [cs].
  5. Y. Chai, B. Sapp, M. Bansal, and D. Anguelov, “MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction,” Oct. 2019, arXiv:1910.05449 [cs, stat].
  6. T. Gilles, S. Sabatini, D. Tsishkou, B. Stanciulescu, and F. Moutarde, “HOME: Heatmap Output for future Motion Estimation,” Jun. 2021, arXiv:2105.10968 [cs].
  7. Z. Huang, X. Mo, and C. Lv, “Recoat: A deep learning-based framework for multi-modal motion prediction in autonomous driving application,” in 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), 2022, pp. 988–993.
  8. M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” Sep. 2020, arXiv:1905.11946 [cs, stat].
  9. S. Konev, K. Brodt, and A. Sanakoyeu, “Motioncnn: A strong baseline for motion prediction in autonomous driving,” 2022.
  10. H. Cui, V. Radosavljevic, F.-C. Chou, T.-H. Lin, T. Nguyen, T.-K. Huang, J. Schneider, and N. Djuric, “Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks,” Mar. 2019, arXiv:1809.10732 [cs, stat].
  11. A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese, “Social LSTM: Human Trajectory Prediction in Crowded Spaces,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA: IEEE, Jun. 2016, pp. 961–971.
  12. J. Gao, C. Sun, H. Zhao, Y. Shen, D. Anguelov, C. Li, and C. Schmid, “VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation,” May 2020, arXiv:2005.04259 [cs, stat].
  13. M. Liang, B. Yang, R. Hu, Y. Chen, R. Liao, S. Feng, and R. Urtasun, “Learning Lane Graph Representations for Motion Forecasting,” Jul. 2020, arXiv:2007.13732 [cs].
  14. J. Ngiam, B. Caine, V. Vasudevan, Z. Zhang, H.-T. L. Chiang, J. Ling, R. Roelofs, A. Bewley, C. Liu, A. Venugopal, D. Weiss, B. Sapp, Z. Chen, and J. Shlens, “Scene Transformer: A unified architecture for predicting multiple agent trajectories,” Mar. 2022, arXiv:2106.08417 [cs].
  15. Q. Sun, X. Huang, J. Gu, B. C. Williams, and H. Zhao, “M2i: From factored marginal trajectory prediction to interactive prediction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 6543–6552.
  16. Z. Zhou, J. Wang, Y.-H. Li, and Y.-K. Huang, “Query-centric trajectory prediction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17863–17873.
  17. Y. Liu, J. Zhang, L. Fang, Q. Jiang, and B. Zhou, “Multimodal Motion Prediction with Stacked Transformers,” Mar. 2021, arXiv:2103.11624 [cs].
  18. L. Fang, Q. Jiang, J. Shi, and B. Zhou, “TPNet: Trajectory Proposal Network for Motion Prediction,” Feb. 2021, arXiv:2004.12255 [cs].
  19. T. Gilles, S. Sabatini, D. Tsishkou, B. Stanciulescu, and F. Moutarde, “GOHOME: Graph-Oriented Heatmap Output for future Motion Estimation,” Sep. 2021, arXiv:2109.01827 [cs].
  20. S. Shi, L. Jiang, D. Dai, and B. Schiele, “MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying,” Jun. 2023, arXiv:2306.17770 [cs].
  21. X. Zheng, L. Wu, Z. Yan, Y. Tang, H. Zhao, C. Zhong, B. Chen, and J. Gong, “Large language models powered context-aware motion prediction,” arXiv preprint arXiv:2403.11057, 2024.
  22. C. Feng, H. Zhou, H. Lin, Z. Zhang, Z. Xu, C. Zhang, B. Zhou, and S. Shen, “MacFormer: Map-Agent Coupled Transformer for Real-time and Robust Trajectory Prediction,” Aug. 2023, arXiv:2308.10280 [cs].