MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts (2403.06041v1)
Abstract: Data-driven methods have great advantages in modeling complicated human behavioral dynamics and in handling many human-robot interaction applications. However, collecting massive, annotated real-world human datasets is laborious, especially for highly interactive scenarios. On the other hand, algorithmic data generation methods are usually limited by their model capacities, making them unable to offer the realistic and diverse data needed by various application users. In this work, we study trajectory-level data generation for multi-human or human-robot interaction scenarios and propose a learning-based automatic trajectory generation model, which we call Multi-Agent TRajectory generation with dIverse conteXts (MATRIX). MATRIX is capable of generating interactive human behaviors in realistic, diverse contexts. We achieve this goal by modeling explicit and interpretable objectives so that MATRIX can generate human motions based on diverse destinations and heterogeneous behaviors. We carry out extensive comparison and ablation studies to illustrate the effectiveness of our approach across various metrics. We also present experiments that demonstrate the capability of MATRIX to serve as data augmentation for imitation-based motion planning.