IA-LSTM: Interaction-Aware LSTM for Pedestrian Trajectory Prediction (2311.15193v2)
Abstract: Predicting pedestrian trajectories in crowded scenarios is indispensable for self-driving vehicles and autonomous mobile robots, because estimating the future locations of surrounding pedestrians informs policy decisions that avoid collisions. The problem is challenging: humans exhibit diverse walking motions, and the interactions between humans and objects in the environment, especially among humans themselves, are complex. Previous work has focused on how to model human-human interactions but has neglected their relative importance. To address this issue, a novel mechanism based on correntropy is introduced. The proposed mechanism not only measures the relative importance of human-human interactions but also builds a personal space for each pedestrian. An interaction module incorporating this data-driven mechanism is further proposed: it extracts feature representations of the dynamic human-human interactions in the scene and computes corresponding weights that represent the importance of each interaction. To share these social messages among pedestrians, an interaction-aware architecture based on a long short-term memory (LSTM) network is designed for trajectory prediction. Experiments on two public datasets demonstrate that the proposed model outperforms several recent high-performing methods.
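The core idea of a correntropy-based interaction weight can be illustrated with a minimal sketch. This is not the paper's implementation: the Gaussian kernel bandwidth `sigma`, the function names, and the softmax-style normalization are assumptions chosen for illustration. The sketch shows how a Gaussian-kernel correntropy between positions yields weights that decay with distance, implicitly defining a personal space of radius on the order of `sigma`.

```python
import numpy as np

def correntropy_weight(pos_i, pos_j, sigma=1.0):
    """Gaussian-kernel correntropy between two 2-D positions.

    Returns a value in (0, 1]: nearby pedestrians receive weights
    close to 1, while distant pedestrians decay toward 0. The
    bandwidth sigma controls the effective 'personal space' radius.
    """
    diff = np.asarray(pos_i, dtype=float) - np.asarray(pos_j, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def interaction_weights(positions, sigma=1.0):
    """Normalized interaction weights for pedestrian 0 w.r.t. the rest.

    positions: array of shape (N, 2); row 0 is the target pedestrian.
    Returns an array of N-1 weights summing to 1, one per neighbor.
    """
    positions = np.asarray(positions, dtype=float)
    target, others = positions[0], positions[1:]
    raw = np.array([correntropy_weight(target, p, sigma) for p in others])
    return raw / raw.sum()  # normalize so weights sum to 1
```

In an LSTM-based interaction module, such weights could scale each neighbor's hidden state before pooling, so that closer (more influential) pedestrians contribute more to the shared social representation.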