Overview of "Playing FPS Games with Deep Reinforcement Learning"
The paper "Playing FPS Games with Deep Reinforcement Learning" by Guillaume Lample and Devendra Singh Chaplot explores the application of deep reinforcement learning (DRL) to first-person shooter (FPS) games in 3D environments. The work marks a significant extension of DRL methods, which had predominantly focused on 2D environments with fully observable states.
Key Contributions
The authors introduce a modular architecture specifically designed to handle the complexities of 3D FPS games, characterized by partially observable environments. This architecture distinguishes itself by integrating game feature augmentation and specialized networks for distinct phases of gameplay, such as navigation and action handling.
Technical Innovations
- Game Feature Augmentation: The model uses internal game-engine data, such as the presence of enemies on screen, to enhance learning. Co-training a Deep Recurrent Q-Network (DRQN) to jointly predict Q-values and these game features substantially speeds up training and improves final performance, because the auxiliary supervision guides the convolutional layers toward critical game elements.
- Modular Architecture: The agent splits into two separate DRL models dedicated to the navigation and action phases of gameplay. This modularization not only improves training efficiency but also allows each phase to be optimized independently, mitigating issues such as "camper" behavior (hiding in one spot and waiting for enemies) that can arise from a unified model.
- Deep Recurrent Q-Networks (DRQN): The DRQN addresses partially observable states by adding an LSTM layer on top of the convolutional features. This allows the network to retain context across frames, which is crucial for decision-making in a 3D FPS environment where a single frame rarely reveals the full game state.
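The two architectural ideas above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `joint_loss`, `choose_network`, and the weighting `lam` are hypothetical names, and the real model trains a full convolutional LSTM network rather than operating on precomputed arrays.

```python
import numpy as np

def joint_loss(q_values, actions, td_targets,
               feature_logits, feature_labels, lam=0.5):
    """Co-training objective (sketch): squared TD error on the Q-values
    plus binary cross-entropy on auxiliary game-feature predictions,
    e.g. 'an enemy is on screen'. `lam` weights the auxiliary term."""
    batch = np.arange(len(actions))
    # Q-learning term: TD error on the actions actually taken
    td_loss = np.mean((q_values[batch, actions] - td_targets) ** 2)
    # Auxiliary term: cross-entropy against labels read from the game engine
    probs = 1.0 / (1.0 + np.exp(-np.asarray(feature_logits, dtype=float)))
    labels = np.asarray(feature_labels, dtype=float)
    bce = -np.mean(labels * np.log(probs + 1e-8)
                   + (1.0 - labels) * np.log(1.0 - probs + 1e-8))
    return td_loss + lam * bce

def choose_network(enemy_visible):
    """Modular arbitration (sketch): run the action network when an
    enemy is detected, otherwise let the navigation network explore."""
    return "action" if enemy_visible else "navigation"
```

Note that the enemy-presence labels come from the game engine during training only; at evaluation time no engine internals are available, so the arbitration rule must rely on the features the network itself predicts.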
Experimental Validation
The proposed architecture is rigorously evaluated on tasks derived from the Visual Doom AI Competition using the ViZDoom API. The results demonstrate its superiority over built-in AI agents and average human players in deathmatch scenarios. Notably, the agent achieves a high kill/death ratio, showcasing both tactical navigation and combat proficiency.
Implications and Future Prospects
The research suggests several significant implications for incorporating DRL in real-world scenarios:
- Robotic Applications: The modular and feature-augmented approach can be adapted for robotics, where navigation through partially known spaces and interaction with dynamic elements are common.
- Game AI Development: Introducing modular reinforcement learning architectures can redefine AI development in games, leading to more challenging and human-like game opponents.
Looking ahead, integrating advanced DRL techniques such as dueling architectures and prioritized experience replay could further enhance agent performance. Applying the framework to complex environments beyond gaming could also broaden its applicability.
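To give a flavor of those two extensions, here is a minimal NumPy sketch; the function names are hypothetical and the code is illustrative, not tied to any particular DRL library.

```python
import numpy as np

def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Proportional prioritized replay (sketch): sample transition i with
    probability P(i) proportional to p_i**alpha, and return
    importance-sampling weights that correct the resulting bias."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(priorities, dtype=float) ** alpha
    probs = scaled / scaled.sum()
    idx = rng.choice(len(probs), size=batch_size, p=probs)
    weights = (len(probs) * probs[idx]) ** (-beta)
    return idx, weights / weights.max()  # normalize weights to [0, 1]

def dueling_q(state_value, advantages):
    """Dueling aggregation (sketch): Q(s,a) = V(s) + A(s,a) - mean_a A(s,a),
    so the value and advantage streams are identifiable."""
    adv = np.asarray(advantages, dtype=float)
    return state_value + adv - adv.mean()
```

With uniform priorities the sampler reduces to uniform replay with unit weights, which is a useful sanity check when wiring either extension into a training loop.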
In conclusion, this paper presents a well-structured approach to handling the complexities of partially observable 3D environments in FPS games, providing a solid foundation for further explorations into autonomous agent development in similarly intricate settings.