Improving Generalization in Game Agents with Data Augmentation in Imitation Learning (2309.12815v3)
Abstract: Imitation learning is an effective approach for training game-playing agents and, consequently, for efficient game production. However, generalization, the ability to perform well in related but unseen scenarios, is an essential requirement that remains an unsolved challenge for game AI. Generalization is difficult for imitation learning agents because it requires the algorithm to take meaningful actions outside of the training distribution. In this paper, we propose a solution to this challenge. Inspired by the success of data augmentation in supervised learning, we augment the training data so that the distribution of states and actions in the dataset better represents the real state-action distribution. This study evaluates methods for combining and applying data augmentations to observations in order to improve the generalization of imitation learning agents, and it provides a performance benchmark of these augmentations across several 3D environments. The results demonstrate that data augmentation is a promising framework for improving generalization in imitation learning agents.
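For concreteness, below is a minimal sketch, not the paper's implementation, of how observation augmentation can be folded into a behavioural-cloning update: recorded demonstrations are perturbed before the policy loss is computed, so the agent is trained on states slightly outside the exact recorded trajectories. The Gaussian-noise `augment` function, the `Policy` network, and all hyperparameters are illustrative assumptions rather than details from the paper.

```python
# Illustrative sketch: behavioural cloning with observation augmentation.
# Assumes vector observations and discrete actions; names and sizes are placeholders.
import torch
import torch.nn as nn

def augment(obs: torch.Tensor, sigma: float = 0.05) -> torch.Tensor:
    """One simple augmentation: perturb observations with zero-mean Gaussian noise."""
    return obs + sigma * torch.randn_like(obs)

class Policy(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # action logits

def bc_step(policy: Policy, optimizer: torch.optim.Optimizer,
            obs: torch.Tensor, actions: torch.Tensor) -> float:
    """One behavioural-cloning update on an augmented batch of demonstrations."""
    logits = policy(augment(obs))                       # train on perturbed observations
    loss = nn.functional.cross_entropy(logits, actions) # match demonstrated actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Gaussian noise is only one possible augmentation; the same hook point can host other perturbations (e.g. random shifts or crops for image observations) or combinations of them, which is the design space the paper benchmarks.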