Meta-Learning Curiosity Algorithms
The paper "Meta-learning curiosity algorithms" explores the concept of curiosity as an intrinsic mechanism that evolutionarily drives agents towards meaningful exploration, which ultimately enables them to achieve high rewards throughout their lifetime. This notion of curiosity is hypothesized to be a pivotal exploratory strategy, encouraging transitions into environments that reveal rewarding knowledge and behaviors. This research introduces meta-learning as an innovative approach to formulating and improving curiosity-driven exploration algorithms.
Objective and Methodology
The central objective is to present a meta-learning framework for developing curiosity algorithms for reinforcement learning (RL) agents. The authors distinguish between an outer loop that searches over a large space of curiosity programs, each of which adapts the agent's reward signal, and an inner loop in which a standard RL algorithm trains on the adapted signal. The analogy is evolutionary: the outer loop plays the role of evolution, discovering reward-shaping mechanisms, while the inner loop corresponds to an agent's lifetime of learning.
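The following is a minimal sketch of this two-loop structure in Python. The `agent` and `curiosity_module` interfaces, the function names, and the step count are illustrative assumptions, not the paper's actual API.

```python
# Minimal sketch of the two-loop structure. The outer loop searches over
# candidate curiosity programs; the inner loop runs standard RL on the
# proxy reward each program produces. All interfaces here are hypothetical.

def inner_loop(env, curiosity_module, agent, num_steps=10_000):
    """Standard RL, except the agent trains on the curiosity module's proxy reward."""
    obs = env.reset()
    total_extrinsic = 0.0
    for _ in range(num_steps):
        action = agent.act(obs)
        next_obs, extrinsic_reward, done, _ = env.step(action)
        # The curiosity module rewrites the reward signal the agent sees.
        proxy_reward = curiosity_module.reward(obs, action, next_obs, extrinsic_reward)
        agent.update(obs, action, proxy_reward, next_obs, done)
        total_extrinsic += extrinsic_reward
        obs = env.reset() if done else next_obs
    # The outer loop scores programs by the extrinsic reward they induce.
    return total_extrinsic

def outer_loop(candidate_programs, make_env, make_agent):
    """Search over curiosity programs, scoring each by a full inner-loop run."""
    scores = {program: inner_loop(make_env(), program, make_agent())
              for program in candidate_programs}
    return max(scores, key=scores.get)
```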
The paper critiques the limitations of current meta-RL methods that transfer neural-network weights, which generalize only between closely related tasks. To overcome this limitation, the authors instead meta-learn over programs that combine neural networks with other components such as buffers, nearest-neighbor modules, and custom loss functions.
Technical Contributions
The authors make three main contributions:
- Domain-Specific Language (DSL): A DSL whose components include neural networks, gradient-descent updates, learned objective functions, ensembles, buffers, and nearest-neighbor regressors. Programs in this DSL define curiosity modules intended to generalize across diverse environments regardless of their state and action spaces (one such module is sketched after this list).
- Efficient Search Strategies: To navigate the large combinatorial space of candidate programs, where evaluating a single candidate entails running a full RL training loop for many timesteps, the authors devise pruning strategies that use cheap benchmark environments to discard unpromising programs early (see the second sketch after this list).
- Empirical Validation: The paper validates the approach by discovering curiosity algorithms that match or exceed established human-designed exploration algorithms across domains such as grid navigation, Acrobot, Lunar Lander, Ant, and Hopper. The discovered algorithms demonstrate notable robustness and transfer across these domains.
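To make the DSL idea concrete, here is a minimal sketch of one curiosity module that such a DSL could express: forward-model prediction error used as an intrinsic reward, in the spirit of well-known prediction-error curiosity methods. The class name, network sizes, and reward combination are illustrative assumptions, not the paper's DSL primitives.

```python
import torch
import torch.nn as nn

class PredictionErrorCuriosity(nn.Module):
    """Hypothetical curiosity module: intrinsic reward = forward-model error."""

    def __init__(self, obs_dim, act_dim, hidden=64, lr=1e-3):
        super().__init__()
        # Learned forward model: predicts the next observation from (obs, action).
        self.forward_model = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )
        self.optimizer = torch.optim.Adam(self.forward_model.parameters(), lr=lr)

    def reward(self, obs, action, next_obs, extrinsic_reward):
        # Assumes obs, action, next_obs are 1-D float tensors.
        pred = self.forward_model(torch.cat([obs, action], dim=-1))
        # Intrinsic reward: how surprising the observed transition was.
        error = ((pred - next_obs) ** 2).mean()
        # Train the forward model online so familiar transitions stop paying out.
        self.optimizer.zero_grad()
        error.backward()
        self.optimizer.step()
        return extrinsic_reward + error.item()
```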
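The pruning idea can likewise be sketched in a few lines. Here `cheap_eval` and `full_eval` are hypothetical stand-ins for short runs on a fast benchmark environment and full-length evaluations, and `keep_fraction` is an assumed threshold.

```python
def prune_and_evaluate(candidates, cheap_eval, full_eval, keep_fraction=0.1):
    """Screen candidates cheaply; run expensive evaluations only on survivors."""
    # Cheap screen: score every program with a short, fast benchmark run.
    screened = sorted(candidates, key=cheap_eval, reverse=True)
    survivors = screened[: max(1, int(len(screened) * keep_fraction))]
    # Expensive, long-horizon evaluation for the most promising programs only.
    return {program: full_eval(program) for program in survivors}
```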
Findings and Implications
The research presents several findings with theoretical and practical implications:
- Discovery of Novel Algorithms: The search uncovers previously unpublished curiosity algorithms that prove surprisingly effective, illustrating the potential of automated algorithm discovery.
- Generalization Across Environments: The meta-learned algorithms generalize across diverse environments, transcending the task-specific boundaries that typically limit meta-RL.
- Future Directions: The paper suggests that the meta-learning approach could extend beyond curiosity algorithms, potentially transforming other domains such as optimization and algorithm design across various AI applications.
Conclusion
In summary, this work contributes substantially to exploration strategies in RL through its meta-learned curiosity modules. While generalization remains a substantial challenge, the authors demonstrate that exploration mechanisms can be learned and then applied to varied tasks and environments, suggesting a roadmap toward more adaptive, autonomous learning systems driven by intrinsic motivation.