Decentralized policy learning with partial observation and mechanical constraints for multiperson modeling (2007.03155v2)
Abstract: Extracting the rules of real-world multi-agent behaviors is a current challenge in various scientific and engineering fields. Biological agents each have limited observation and are subject to mechanical constraints; however, most conventional data-driven models ignore these assumptions, resulting in a lack of biological plausibility and model interpretability for behavioral analyses. Here we propose sequential generative models with partial observation and mechanical constraints in a decentralized manner, which can model agents' cognition and body dynamics and predict biologically plausible behaviors. We formulate this as a decentralized multi-agent imitation-learning problem, leveraging binary partial observation and decentralized policy models based on hierarchical variational recurrent neural networks with physical and biomechanical penalties. Using real-world basketball and soccer datasets, we show the effectiveness of our method in terms of constraint violations, long-term trajectory prediction, and partial observation. Our approach can be used as a multi-agent simulator to generate realistic trajectories using real-world data.