
Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation (2401.12275v2)

Published 22 Jan 2024 in cs.RO, cs.AI, cs.CV, cs.LG, and cs.MA

Abstract: Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning. While modeling pairwise relations has been widely studied in multi-agent interacting systems, the ability to capture larger-scale group-wise activities is limited. In this paper, we propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures, and we demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation. In addition to the edges between pairs of nodes (i.e., agents), we propose to infer hyperedges that adaptively connect multiple nodes to enable group-wise reasoning in an unsupervised manner. Our approach infers dynamically evolving relation graphs and hypergraphs to capture the evolution of relations, which the trajectory predictor employs to generate future states. Meanwhile, we propose to regularize the sharpness and sparsity of the learned relations and the smoothness of the relation evolution, which proves to enhance training stability and model performance. The proposed approach is validated on synthetic crowd simulations and real-world benchmark datasets. Experiments demonstrate that the approach infers reasonable relations and achieves state-of-the-art prediction performance. In addition, we present a deep reinforcement learning (DRL) framework for social robot navigation, which incorporates relational reasoning and trajectory prediction systematically. In a group-based crowd simulation, our method outperforms the strongest baseline by a significant margin in terms of safety, efficiency, and social compliance in dense, interactive scenarios. We also demonstrate the practical applicability of our method with real-world robot experiments. The code and videos can be found at https://relational-reasoning-nav.github.io/.
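The abstract mentions regularizing the sharpness and sparsity of the learned relations and the smoothness of their evolution, but it does not give the loss terms. The sketch below is only one plausible way to express such penalties, not the paper's formulation. It assumes a hypothetical soft incidence matrix (rows are agents, columns are hyperedges, entries in [0, 1]) and uses binary entropy, mean magnitude, and mean absolute temporal difference as stand-ins. All function names are illustrative.

```python
import math

def binary_entropy(p, eps=1e-12):
    """Entropy of a Bernoulli(p) variable; low when p is near 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def sharpness_penalty(incidence):
    """Mean binary entropy of all entries (assumed proxy for sharpness)."""
    vals = [p for row in incidence for p in row]
    return sum(binary_entropy(p) for p in vals) / len(vals)

def sparsity_penalty(incidence):
    """Mean absolute value of all entries (assumed L1-style proxy for sparsity)."""
    vals = [p for row in incidence for p in row]
    return sum(abs(p) for p in vals) / len(vals)

def smoothness_penalty(incidence_seq):
    """Mean absolute change between consecutive time steps (assumed proxy)."""
    total, count = 0.0, 0
    for prev, curr in zip(incidence_seq, incidence_seq[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
    return total / count if count else 0.0

# Hypothetical example: 3 agents, 2 hyperedges, 2 time steps.
step_0 = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
step_1 = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
penalties = (
    sharpness_penalty(step_1),
    sparsity_penalty(step_1),
    smoothness_penalty([step_0, step_1]),
)
```

Under these assumptions, a hard assignment (entries exactly 0 or 1) drives the sharpness term toward zero, and a relation structure that does not change between steps drives the smoothness term to zero. In a training loop such terms would typically be weighted and added to the prediction loss; the weights are not stated in the abstract.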
