Reinforcement Learning with Augmented Data: An Expert Overview
The paper "Reinforcement Learning with Augmented Data" by Laskin et al. addresses two critical challenges in reinforcement learning (RL): data efficiency and generalization. It introduces RAD (Reinforcement Learning with Augmented Data), a technique that applies data augmentation to an agent's observations while leaving the underlying RL algorithm unchanged, and demonstrates consistent performance gains across a range of benchmarks.
Key Contributions
- Data Augmentation in RL: The paper presents an extensive study of data augmentations in RL, covering techniques such as random translate, random crop, color jitter, patch cutout, and amplitude scale, with the aim of improving learning from both visual and state-based inputs.
- New Augmentation Techniques: The paper introduces two augmentations not previously used in RL: random translate for pixel inputs and random amplitude scale for proprioceptive inputs (both sketched in code after this list).
- Benchmark Performance: RAD achieves state-of-the-art data efficiency and performance on both the DeepMind Control Suite and OpenAI Gym, matching or outperforming considerably more complex methods, and it improves generalization on the OpenAI ProcGen benchmarks.
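To make the two new augmentations concrete, here is a minimal NumPy sketch. The function names and the uniform sampling range for amplitude scaling are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def random_translate(imgs, size):
    """Random translate for pixel inputs: place each image at a
    random location inside a larger zero-padded canvas."""
    n, c, h, w = imgs.shape  # batch of channel-first images
    assert size >= h and size >= w
    out = np.zeros((n, c, size, size), dtype=imgs.dtype)
    for i, img in enumerate(imgs):
        top = np.random.randint(0, size - h + 1)
        left = np.random.randint(0, size - w + 1)
        out[i, :, top:top + h, left:left + w] = img
    return out

def random_amplitude_scale(states, alpha=0.6, beta=1.4):
    """Random amplitude scale for proprioceptive inputs: multiply
    each state vector by a scalar z ~ Uniform[alpha, beta].
    The default [alpha, beta] range here is illustrative."""
    z = np.random.uniform(alpha, beta, size=(states.shape[0], 1))
    return states * z
```

Both operations vary the surface form of an observation while keeping its task-relevant content intact, which is what makes them usable as label-free augmentations in RL.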
Numerical Results and Implications
- DeepMind Control Suite: RAD surpasses prior methods in data efficiency, with roughly a 4x improvement over pixel-based SAC, and matches the performance of state-based SAC on most of the tested environments.
- OpenAI Gym: RAD with random amplitude scaling outperforms several model-based and model-free baselines, showing that the approach applies beyond pixel inputs.
- ProcGen Generalization: RAD outperforms standard PPO in generalization tests, even when trained on fewer environment variations, demonstrating its effectiveness in tasks that require transferring learned policies to unseen environments.
Practical and Theoretical Implications
From a practical standpoint, RAD offers a straightforward, plug-and-play way to improve RL algorithms on both visual and state inputs: augmentation is applied to observations without altering the existing architecture, yielding better data efficiency and generalization at relatively low computational cost.
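As an illustration of that plug-and-play quality, the following sketch shows where augmentation enters a standard off-policy update loop. The `agent`, `replay_buffer`, and `augment` objects are hypothetical placeholders for any existing actor-critic implementation (e.g. SAC); this is a sketch of the idea, not the paper's code:

```python
def rad_update(agent, replay_buffer, augment, batch_size=128):
    """One training step with RAD-style augmentation.

    agent, replay_buffer, and augment are placeholders for an
    existing actor-critic stack; nothing inside agent.update()
    needs to change.
    """
    obs, action, reward, next_obs, done = replay_buffer.sample(batch_size)
    # The only RAD-specific step: augment observations before the update.
    obs = augment(obs)
    next_obs = augment(next_obs)
    agent.update(obs, action, reward, next_obs, done)
```

Because the augmentation sits entirely in the data path, any of the transformations above (or a composition of them) can be swapped in as `augment` without touching the learner.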
Theoretically, RAD underscores that data augmentation, a principle long exploited in computer vision, carries over to RL. It raises questions about the role of inductive biases in RL and about how augmenting input data can steer learning toward better policy representations.
Future Directions
Future research may explore the combination of RAD with other auxiliary learning strategies, such as contrastive learning, to further enrich policy representations. Investigating the limits of data augmentation in dynamic and more complex real-world environments could also provide insights into developing robust RL systems.
In conclusion, Laskin et al.'s RAD technique represents a notable advancement in applying data augmentation to reinforcement learning. By addressing the fundamental challenges of efficiency and generalization, RAD establishes strong new baselines and opens avenues for further research into enhancing RL frameworks.