Evolving simple programs for playing Atari games (1806.05695v1)

Published 14 Jun 2018 in cs.NE and cs.AI

Abstract: Cartesian Genetic Programming (CGP) has previously shown capabilities in image processing tasks by evolving programs with a function set specialized for computer vision. A similar approach can be applied to Atari playing. Programs are evolved using mixed type CGP with a function set suited for matrix operations, including image processing, but allowing for controller behavior to emerge. While the programs are relatively small, many controllers are competitive with state of the art methods for the Atari benchmark set and require less training time. By evaluating the programs of the best evolved individuals, simple but effective strategies can be found.

Citations (61)

Summary

  • The paper applies Cartesian Genetic Programming (CGP) to evolve simple, fixed-length programs for playing Atari games, extending CGP's use to reinforcement learning tasks.
  • Evolved CGP programs achieved competitive scores on many Atari games, in some cases surpassing deep reinforcement learning agents on specific titles.
  • The concise, fixed-length CGP programs are interpretable: unlike opaque deep neural network controllers, they can be read directly to reveal the evolved game strategies.

Overview of "Evolving Simple Programs for Playing Atari Games"

The paper "Evolving Simple Programs for Playing Atari Games" introduces an exploration into the application of Cartesian Genetic Programming (CGP) to Atari game playing, expanding the use of CGP from its established role in image processing to reinforcement learning tasks. The approach entails evolving simple programs with function sets conducive to handling matrix operations, thereby enabling these programs to process raw pixel input from Atari games and generate competitive game-playing strategies.

Methodological Framework

The authors implement CGP with a floating point representation in which each node in the program corresponds to four genes: a function choice, two input connections, and a parameter. Evolution is driven by a 1+λ evolutionary algorithm with tuned parameters, mutating node and output genes. The function set combines basic mathematical operations with matrix and list processing functions, so programs can consume pixel-based inputs and produce the outputs required for game interaction.
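
To make the representation concrete, here is a minimal sketch of a floating point CGP genome and a 1+λ loop. The sizes, mutation rate, and toy scalar function set are illustrative assumptions, not the paper's configuration (the paper evolves mixed-type programs whose nodes operate on pixel matrices):

```python
import random

N_INPUTS = 3        # e.g. one plane per color channel (assumption)
N_NODES = 40        # fixed node count: the genome length never changes
N_OUTPUTS = 2       # one output gene per available action
GENES_PER_NODE = 4  # function choice, two inputs, one parameter

FUNCTIONS = [
    lambda x, y, p: x + y,
    lambda x, y, p: x * y,
    lambda x, y, p: max(x, y),
    lambda x, y, p: p,          # constant drawn from the parameter gene
]

def random_genome():
    # Every gene is a float in [0, 1); decoding maps genes onto
    # functions, connections, and parameters.
    return [random.random()
            for _ in range(N_NODES * GENES_PER_NODE + N_OUTPUTS)]

def run_program(genome, inputs):
    # Decode and execute the graph: node i may read the program inputs
    # or the output of any earlier node (a feed-forward CGP grid).
    values = list(inputs)
    for i in range(N_NODES):
        f_gene, x_gene, y_gene, p = genome[i * GENES_PER_NODE:
                                           (i + 1) * GENES_PER_NODE]
        f = FUNCTIONS[int(f_gene * len(FUNCTIONS)) % len(FUNCTIONS)]
        x = values[int(x_gene * len(values)) % len(values)]
        y = values[int(y_gene * len(values)) % len(values)]
        values.append(f(x, y, p))
    return [values[int(g * len(values)) % len(values)]
            for g in genome[-N_OUTPUTS:]]

def mutate(genome, rate=0.1):
    # Point mutation: redraw a fraction of genes uniformly at random.
    return [random.random() if random.random() < rate else g
            for g in genome]

def one_plus_lambda(fitness, lam=9, generations=1000):
    # 1+lambda EA: keep the parent unless the best child ties or beats
    # it; accepting ties permits neutral drift, which CGP relies on.
    parent = random_genome()
    parent_fit = fitness(parent)
    for _ in range(generations):
        children = [mutate(parent) for _ in range(lam)]
        fits = [fitness(c) for c in children]
        best = max(range(lam), key=fits.__getitem__)
        if fits[best] >= parent_fit:
            parent, parent_fit = children[best], fits[best]
    return parent, parent_fit
```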

The evolved programs are notably concise and use a fixed-length genome, which facilitates program analysis. This is an advantage over deep neural network controllers, whose far larger parameter counts make their decision-making opaque. CGP thus allows direct inspection of the evolved strategies, often revealing simple yet effective game actions.
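
This inspection is possible because only nodes reachable from the output genes affect behavior. Continuing the sketch above (same assumed genome layout), the influential subgraph can be recovered by a backward trace:

```python
def active_nodes(genome):
    # Walk backward from the output genes to the nodes that can
    # actually influence the outputs; the rest are inactive "junk"
    # that mutation explores neutrally. Indices below N_INPUTS denote
    # program inputs rather than nodes.
    total = N_INPUTS + N_NODES
    stack = [int(g * total) % total for g in genome[-N_OUTPUTS:]]
    active = set()
    while stack:
        idx = stack.pop()
        if idx < N_INPUTS or idx in active:
            continue                      # program input or already seen
        active.add(idx)
        node = idx - N_INPUTS
        base = node * GENES_PER_NODE
        avail = N_INPUTS + node           # values visible to this node
        stack.append(int(genome[base + 1] * avail) % avail)
        stack.append(int(genome[base + 2] * avail) % avail)
    return sorted(active)
```

The active set is typically a small fraction of the genome's nodes, which is what makes manual reading of the evolved strategies feasible.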

Empirical Evaluation and Results

Experiments were conducted in the Arcade Learning Environment (ALE), a standard benchmark for AI game-playing performance. Scores from CGP programs were compared against human players and state-of-the-art AI methods, including deep reinforcement learning approaches such as A3C, Dueling DQN, and DQN with prioritized experience replay.
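
As a rough illustration of this evaluation setup, the loop below scores one episode through ALE's Python bindings (ale-py). The `rom_path` argument and the `cgp_act` controller are hypothetical stand-ins for an evolved program; the paper's exact frame handling and action selection may differ:

```python
import numpy as np
from ale_py import ALEInterface  # Python bindings for the Arcade Learning Environment

def play_episode(rom_path, cgp_act, max_frames=18_000):
    # Score one episode: hand the RGB screen to the controller and
    # apply whichever legal action it ranks highest.
    ale = ALEInterface()
    ale.setInt("random_seed", 0)
    ale.loadROM(rom_path)
    actions = ale.getMinimalActionSet()
    total, frame = 0.0, 0
    while not ale.game_over() and frame < max_frames:
        screen = ale.getScreenRGB()                    # H x W x 3 uint8
        # one float matrix per color channel, matching the paper's
        # pixel-matrix inputs
        planes = [screen[:, :, c] / 255.0 for c in range(3)]
        outputs = cgp_act(planes)                      # one score per action
        total += ale.act(actions[int(np.argmax(outputs))])
        frame += 1
    return total
```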

The CGP programs achieved competitive scores on multiple Atari games, outperforming several advanced AI approaches on specific titles. Notably, CGP excelled in games like Kung-Fu Master and Boxing by encoding simple strategies that exploit game-specific regularities. For instance, by repeatedly executing a crouching punch in Kung-Fu Master, the agent minimized damage taken while steadily accumulating score.

Implications and Future Work

The paper's findings suggest that CGP can serve as a viable alternative to more complex AI methods in reinforcement learning environments, especially where interpretability of the decision-making process holds significant value. The succinctness of CGP programs allows for better insight into game strategies, potentially informing human approaches to these games.

Looking ahead, the authors propose novelty metrics to escape the local optima that stall evolutionary progress, and suggest that evolving policies sensitive to specific frames could make the programs more adaptive. Comparative analyses of computational cost, along with standardized evaluation protocols against deep learning methods, remain necessary to fully establish CGP's efficacy in this domain.

Overall, the application of CGP to Atari games represents a novel intersection between genetic programming and AI game playing, offering insights into efficient policy encoding with potential practical benefits in domains requiring transparent logic representation.
