
Distilling Knowledge for Short-to-Long Term Trajectory Prediction (2305.08553v4)

Published 15 May 2023 in cs.CV, cs.AI, cs.LG, and cs.RO

Abstract: Long-term trajectory forecasting is an important and challenging problem in computer vision, machine learning, and robotics. A fundamental difficulty is that a trajectory's evolution becomes increasingly uncertain and unpredictable as the time horizon grows, which increases the complexity of the problem. To overcome this issue, we propose Di-Long, a new method that distills a short-term trajectory forecasting model (the teacher) to guide a student network for long-term trajectory prediction during training. Given a total sequence length that comprises the observation allowed for the student network and the complementary target sequence, we let the student and the teacher solve two different but related tasks defined over the same full trajectory: the student observes a short sequence and predicts a long trajectory, whereas the teacher observes a longer sequence and predicts the remaining short target trajectory. The teacher's task is less uncertain, and we use its accurate predictions to guide the student through our knowledge distillation framework, reducing long-term future uncertainty. Our experiments show that Di-Long is effective for long-term forecasting and achieves state-of-the-art performance on the Intersection Drone Dataset (inD) and the Stanford Drone Dataset (SDD).
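
The student/teacher task split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_tasks` and `distillation_loss`, the weight `alpha`, and the use of a mean squared error on raw 2D coordinates are all assumptions made for illustration, since the abstract does not specify the loss or the output representation.

```python
import numpy as np


def split_tasks(traj, student_obs_len, teacher_target_len):
    """Split one full trajectory of shape (T, 2) into the two related tasks.

    Hypothetical helper: the student observes the first `student_obs_len`
    steps and must predict everything after them; the teacher observes all
    but the last `teacher_target_len` steps and predicts only those.
    """
    total_len = len(traj)
    teacher_obs_len = total_len - teacher_target_len
    if student_obs_len <= 0 or teacher_target_len <= 0:
        raise ValueError("lengths must be positive")
    if teacher_obs_len <= student_obs_len:
        raise ValueError("teacher must observe a longer sequence than the student")
    student = (traj[:student_obs_len], traj[student_obs_len:])
    teacher = (traj[:teacher_obs_len], traj[teacher_obs_len:])
    return student, teacher


def distillation_loss(student_pred, ground_truth, teacher_pred, alpha=0.5):
    """Blend a ground-truth term with a teacher-guidance term (assumed form).

    `student_pred` and `ground_truth` cover the student's long horizon;
    `teacher_pred` covers only the final steps, so the guidance term is
    computed on the overlapping tail of the student's prediction.
    """
    gt_term = np.mean((student_pred - ground_truth) ** 2)
    tail = student_pred[-len(teacher_pred):]
    kd_term = np.mean((tail - teacher_pred) ** 2)
    return (1.0 - alpha) * gt_term + alpha * kd_term
```

For a 10-step trajectory with `student_obs_len=3` and `teacher_target_len=2`, the student observes 3 steps and predicts 7, while the teacher observes 8 steps and predicts the final 2. The guidance term therefore applies only to the last 2 steps of the student's prediction, where the teacher's shorter horizon makes its forecast less uncertain.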
