Bootstrap Motion Forecasting With Self-Consistent Constraints (2204.05859v4)

Published 12 Apr 2022 in cs.CV and cs.RO

Abstract: We present a novel framework to bootstrap Motion forecasting with Self-consistent Constraints (MISC). The motion forecasting task aims at predicting future trajectories of vehicles by incorporating spatial and temporal information from the past. A key design of MISC is the proposed Dual Consistency Constraints that regularize the predicted trajectories under spatial and temporal perturbation during training. Also, to model the multi-modality in motion forecasting, we design a novel self-ensembling scheme to obtain accurate teacher targets to enforce the self-constraints with multi-modality supervision. With explicit constraints from multiple teacher targets, we observe a clear improvement in the prediction performance. Extensive experiments on the Argoverse motion forecasting benchmark and Waymo Open Motion dataset show that MISC significantly outperforms the state-of-the-art methods. As the proposed strategies are general and can be easily incorporated into other motion forecasting approaches, we also demonstrate that our proposed scheme consistently improves the prediction performance of several existing methods.
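The idea of regularizing predictions under spatial perturbation can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal NumPy example that assumes the perturbation is a planar rotation, that a predictor maps a past track of shape `(T_past, 2)` to `K` future trajectories of shape `(K, T, 2)`, and that the modes of the two predictions are index-aligned. The names `rotate`, `spatial_consistency_loss`, `constant_velocity`, and `axis_biased` are hypothetical and chosen only for illustration. The loss compares the prediction on the original input with the prediction on the rotated input after mapping it back to the original frame.

```python
import numpy as np


def rotate(points, angle):
    """Rotate (..., 2) xy points counter-clockwise by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return points @ rot.T


def spatial_consistency_loss(predict, history, angle):
    """Mean L2 gap between the prediction on the original input and the
    prediction on a rotated input mapped back to the original frame.

    Assumes the K modes of both predictions are index-aligned."""
    pred = predict(history)                            # (K, T, 2)
    pred_perturbed = predict(rotate(history, angle))   # (K, T, 2), rotated frame
    pred_back = rotate(pred_perturbed, -angle)         # back to original frame
    return float(np.mean(np.linalg.norm(pred - pred_back, axis=-1)))


def constant_velocity(history, horizon=5, modes=1):
    """Toy rotation-equivariant predictor: extrapolate the last observed step."""
    step = history[-1] - history[-2]
    future = history[-1] + step * np.arange(1, horizon + 1)[:, None]
    return np.repeat(future[None], modes, axis=0)


def axis_biased(history, horizon=5):
    """Toy predictor that ignores heading and always moves along +x."""
    direction = np.array([1.0, 0.0])
    future = history[-1] + direction * np.arange(1, horizon + 1)[:, None]
    return future[None]


if __name__ == "__main__":
    history = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
    angle = np.pi / 2
    print(spatial_consistency_loss(constant_velocity, history, angle))
    print(spatial_consistency_loss(axis_biased, history, angle))
```

A predictor that respects the rotation, such as the constant-velocity toy, incurs a loss near zero, while the axis-biased toy is penalized. In a training setup this term would be added to the usual regression loss; the temporal perturbation and the self-ensembled teacher targets mentioned in the abstract are not modeled here.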
