
Decentralized policy learning with partial observation and mechanical constraints for multiperson modeling (2007.03155v2)

Published 7 Jul 2020 in cs.LG, cs.MA, and stat.ML

Abstract: Extracting the rules of real-world multi-agent behaviors is a current challenge in various scientific and engineering fields. Biological agents each have limited observation and mechanical constraints; however, most conventional data-driven models ignore these properties, resulting in a lack of biological plausibility and model interpretability for behavioral analyses. Here we propose decentralized sequential generative models with partial observation and mechanical constraints, which can model agents' cognition and body dynamics and predict biologically plausible behaviors. We formulate this as a decentralized multi-agent imitation-learning problem, leveraging binary partial observation and decentralized policy models based on hierarchical variational recurrent neural networks with physical and biomechanical penalties. Using real-world basketball and soccer datasets, we show the effectiveness of our method in terms of constraint violations, long-term trajectory prediction, and partial observation. Our approach can be used as a multi-agent simulator to generate realistic trajectories using real-world data.
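
To make two ideas from the abstract concrete, the following is a minimal illustrative sketch, not the paper's formulation. It assumes that "binary partial observation" can be pictured as a 0/1 mask over the other agents' states, and that a "physical penalty" can be pictured as the mismatch between a predicted velocity and the finite difference of predicted positions. The function names `mask_observation` and `kinematic_penalty`, the zero-fill masking, and the squared-error form are all assumptions introduced here for illustration.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]  # (x, y) pair; hypothetical 2-D state for illustration


def mask_observation(states: Sequence[Vec], mask: Sequence[int]) -> List[Vec]:
    """Keep agents whose mask entry is 1; zero out the rest.

    Assumption: a hard 0/1 mask with zero-fill stands in for the paper's
    binary partial observation, which the abstract does not specify.
    """
    if len(states) != len(mask):
        raise ValueError("states and mask must have the same length")
    return [(x, y) if m == 1 else (0.0, 0.0) for (x, y), m in zip(states, mask)]


def kinematic_penalty(
    positions: Sequence[Vec], velocities: Sequence[Vec], dt: float
) -> float:
    """Mean squared mismatch between predicted velocities and the
    finite-difference velocities implied by consecutive positions.

    Assumption: velocities[t] describes the motion from positions[t]
    to positions[t + 1]; this penalty form is illustrative only.
    """
    steps = len(positions) - 1
    if steps <= 0:
        return 0.0
    total = 0.0
    for t in range(steps):
        fd_vx = (positions[t + 1][0] - positions[t][0]) / dt
        fd_vy = (positions[t + 1][1] - positions[t][1]) / dt
        total += (fd_vx - velocities[t][0]) ** 2 + (fd_vy - velocities[t][1]) ** 2
    return total / steps
```

In a learned model, a penalty of this kind would typically be added to the training loss so that generated positions and velocities stay mutually consistent; how the paper weights or combines its penalties is not stated in the abstract.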
