Transfer Learning Study of Motion Transformer-based Trajectory Predictions (2404.08271v3)

Published 12 Apr 2024 in cs.LG and cs.RO

Abstract: Trajectory planning in autonomous driving depends heavily on predicting the emergent behavior of other road users. Learning-based methods currently show impressive results in simulation-based challenges, with transformer-based architectures leading the way. Ultimately, however, predictions are needed in the real world. Beyond the shift from simulation to the real world, many vehicle- and country-specific shifts must be bridged, i.e., differences in sensor systems, fusion and perception algorithms, as well as traffic rules and laws. Since models that can cover all system setups and design domains at once are not yet foreseeable, model adaptation plays a central role. Therefore, a simulation-based study on transfer learning techniques is conducted on the basis of a transformer-based model. Furthermore, the study aims to provide insights into possible trade-offs between computational time and performance to support effective transfers to the real world.
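
To make the adaptation setting concrete, below is a minimal PyTorch sketch of one common transfer-learning strategy: freezing a pretrained encoder and fine-tuning only the prediction head on target-domain data. It is purely illustrative; the model, shapes, checkpoint path, and hyperparameters are hypothetical and do not reproduce the paper's Motion Transformer (MTR) setup.

```python
# Illustrative sketch only: a toy transformer encoder + regression head for
# trajectory prediction. All names and shapes are hypothetical, not MTR.
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    def __init__(self, d_model=128, n_heads=4, horizon=80):
        super().__init__()
        # Embed past agent states (x, y, heading, speed) per timestep.
        self.embed = nn.Linear(4, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        # Head regresses `horizon` future (x, y) waypoints.
        self.head = nn.Linear(d_model, horizon * 2)

    def forward(self, history):                 # history: (B, T, 4)
        z = self.encoder(self.embed(history))   # (B, T, d_model)
        return self.head(z[:, -1]).view(history.size(0), -1, 2)

model = TrajectoryPredictor()
# model.load_state_dict(torch.load("source_domain.pt"))  # hypothetical checkpoint

# Transfer step: freeze the pretrained encoder; only the head adapts to the
# target domain (e.g. a different sensor setup or country).
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)

history = torch.randn(8, 11, 4)   # 8 agents, 1.1 s of past states at 10 Hz
future = model(history)           # (8, 80, 2) predicted waypoints
loss = nn.functional.mse_loss(future, torch.zeros_like(future))  # dummy target
loss.backward()
optimizer.step()
```

In this setup only the head receives gradient updates, the cheapest end of the compute/performance trade-off the study examines; unfreezing additional layers typically costs more training time in exchange for a potentially better target-domain fit.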

