- The paper introduces TeachMyAgent, a benchmark platform that evaluates Automatic Curriculum Learning (ACL) algorithms in Deep RL through procedural task generation.
- The paper performs a comparative analysis of multiple ACL methods across its two environments, Stump Tracks and Parkour, under different levels of expert knowledge.
- The paper reveals that certain ACL algorithms perform competitively without heavy expert input, highlighting opportunities for advancing robust curriculum strategies.
TeachMyAgent: A Benchmark for Automatic Curriculum Learning in Deep RL
The research paper "TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL" presents a comprehensive benchmarking platform specifically designed for evaluating Automatic Curriculum Learning (ACL) algorithms. The authors recognize the paucity of standardized benchmarks in the ACL space despite its increasing relevance in Deep Reinforcement Learning (DRL). By introducing TeachMyAgent, they aim to bridge this gap and facilitate a structured comparison of ACL methods.
Challenges and Benchmark Development
The paper identifies key challenges that ACL algorithms must handle, such as mostly unfeasible task spaces, robustness to diverse students, and rugged difficulty landscapes. It addresses these challenges through a procedural task generation system with two main environments: Stump Tracks and Parkour. Stump Tracks serves as a set of unit tests, each isolating a specific ACL challenge, while Parkour combines multiple challenges in a single environment, making it suitable for assessing the global performance of curriculum generation strategies.
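To make the setup concrete, the sketch below shows how such a procedurally generated task space can be exposed to a teacher: each Stump Tracks task reduces to a small continuous parameter vector that the teacher samples before every episode. The class name, field names, and bounds here are illustrative assumptions based on the paper's description, not the library's actual API.

```python
# Illustrative sketch only: TeachMyAgent's real API may differ.
# Stump Tracks tasks are parameterized by a small continuous vector
# (e.g. stump height and spacing) that a teacher samples each episode.
from dataclasses import dataclass
import numpy as np

@dataclass
class StumpTracksTask:
    stump_height: float   # mean height of the obstacles
    stump_spacing: float  # distance between consecutive stumps

def sample_random_task(rng: np.random.Generator) -> StumpTracksTask:
    """Uniform sampling over the task space (the 'Random' baseline teacher)."""
    return StumpTracksTask(
        stump_height=rng.uniform(0.0, 3.0),   # bounds assumed for illustration
        stump_spacing=rng.uniform(0.0, 6.0),
    )

rng = np.random.default_rng(0)
print(sample_random_task(rng))
```

A low-dimensional, bounded task space like this is what makes Stump Tracks usable as a unit test; Parkour's far larger space, which includes a CPPN-based terrain encoding, is what makes it a harder global challenge.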
Methodology and ACL Algorithms
The paper conducts an in-depth comparative analysis of several ACL approaches, including RIAC, Covar-GMM, ALP-GMM, Goal-GAN, Setter-Solver, ADR, and SPDL, alongside a random task selection baseline. The experiments are run under three levels of prior expert knowledge (none, low, and high) to evaluate how each method performs as the available priors vary.
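Despite their algorithmic differences, all of these teachers plug into the same minimal interface: propose a task, let the student run an episode on it, then observe the episodic return. The sketch below illustrates that loop with the random baseline; the class and method names are assumptions for illustration, not the benchmark's actual API.

```python
# Minimal sketch of the teacher/student interaction loop shared by the
# compared ACL methods; names are illustrative, not the benchmark's API.
import numpy as np

class RandomTeacher:
    """Expert-knowledge-free baseline: ignores the student's feedback."""
    def __init__(self, bounds, seed=0):
        self.bounds = np.asarray(bounds, dtype=float)  # shape (dim, 2)
        self.rng = np.random.default_rng(seed)

    def sample_task(self) -> np.ndarray:
        low, high = self.bounds[:, 0], self.bounds[:, 1]
        return self.rng.uniform(low, high)

    def update(self, task: np.ndarray, episodic_return: float) -> None:
        pass  # adaptive teachers (e.g. ALP-GMM, ADR) refit their sampler here

def train(teacher, run_episode, n_episodes=1000):
    for _ in range(n_episodes):
        task = teacher.sample_task()   # teacher proposes a task
        ret = run_episode(task)        # DRL student trains on it
        teacher.update(task, ret)      # teacher adapts its curriculum

# Usage with a dummy student whose return depends on task difficulty:
teacher = RandomTeacher(bounds=[[0.0, 3.0], [0.0, 6.0]])
train(teacher, run_episode=lambda task: float(-task.sum()), n_episodes=10)
```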
Findings and Implications
The results illustrate that current ACL methods vary significantly in their efficacy across challenges, with some algorithms performing strongly without relying heavily on prior expert knowledge. Notably, the results on the Parkour environment reveal that its task space remains largely unsolved, leaving substantial room for future research and innovation in ACL methods.
The findings emphasize that algorithms requiring no expert knowledge are competitive, pointing to the potential for efficient ACL strategies that avoid costly expert-knowledge engineering. Furthermore, the benchmark exposes weaknesses of existing ACL algorithms, such as their inability to adapt efficiently to student forgetting or to navigate rugged difficulty landscapes.
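To illustrate how a teacher can shape a curriculum with no expert knowledge at all, the following simplified sketch captures the core signal behind ALP-GMM: a task's absolute learning progress (ALP) is the absolute return difference relative to the nearest previously attempted task, and a Gaussian mixture fit on (task, ALP) tuples biases sampling toward high-progress regions. This is a stripped-down illustration, not the authors' implementation, which among other things samples components proportionally to their mean ALP rather than greedily and mixes in residual random exploration.

```python
# Simplified illustration of ALP-GMM's core signal (not the authors' code).
import numpy as np
from sklearn.mixture import GaussianMixture

history_tasks, history_returns, alp_points = [], [], []

def record(task, episodic_return):
    """ALP of a new task = |return - return of the nearest previous task|."""
    if history_tasks:
        dists = np.linalg.norm(np.asarray(history_tasks) - task, axis=1)
        nearest = int(np.argmin(dists))
        alp = abs(episodic_return - history_returns[nearest])
        alp_points.append(np.concatenate([task, [alp]]))
    history_tasks.append(task)
    history_returns.append(episodic_return)

def fit_and_sample(rng, n_components=3):
    """Fit a GMM on (task, ALP) tuples, then sample a new task from the
    component whose mean ALP (last coordinate) is highest."""
    gmm = GaussianMixture(n_components=n_components, random_state=0)
    gmm.fit(np.asarray(alp_points))
    best = int(np.argmax(gmm.means_[:, -1]))
    return rng.multivariate_normal(gmm.means_[best][:-1],
                                   gmm.covariances_[best][:-1, :-1])

# Usage with synthetic feedback in a 2D task space:
rng = np.random.default_rng(0)
for _ in range(50):
    t = rng.uniform(0.0, 3.0, size=2)
    record(t, episodic_return=float(-t.sum()) + rng.normal())
next_task = fit_and_sample(rng)
```

Because the signal is built purely from the student's own returns, no task-space priors or hand-designed curricula are needed, which is precisely why such methods remain competitive in the no-expert-knowledge setting.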
Future Directions
TeachMyAgent provides a valuable tool for the ACL research community, creating opportunities to explore more complex task spaces and more refined teaching algorithms. Future research could extend the benchmark with more diverse environments and higher-dimensional task spaces. The paper advocates advances in curriculum learning strategies that could lead to more generalized and efficient DRL systems.
By open-sourcing TeachMyAgent, the authors encourage community-driven enhancements and anticipate that the benchmark will inspire the ongoing evolution of curriculum learning methodologies in reinforcement learning. This collaborative framework sets the stage for adaptive learning systems capable of handling increasingly complex and varied learning tasks.