Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
The paper by Forestier et al. introduces a framework of intrinsically motivated goal exploration processes (IMGEP) that aims to model properties of autonomous developmental learning observed in human infants. These processes are rooted in spontaneous, intrinsically motivated exploration, which enables the cumulative learning of myriad tasks in both children and AI systems. By leveraging self-generated goals, the framework supports more effective exploration strategies, contributing to a more efficient acquisition of complex skill repertoires in machines.
The IMGEP framework is built around several core ideas: the generation and exploration of parameterized goals; the exploitation of intrinsic rewards to prioritize areas of exploration; and the strategic reuse of knowledge gained while attempting one goal to help reach others. Importantly, IMGEP departs from traditional reinforcement learning paradigms by not assuming a clear, externally defined target task. Instead, it relies on intrinsic learning signals that drive curiosity-driven learning about the environment and serve as heuristic markers for curriculum learning. A minimal sketch of this loop is given below.
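To make the loop concrete, here is a minimal sketch of a population-based goal exploration process, assuming a toy black-box environment and a nearest-neighbor inverse model (all names are hypothetical illustrations, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def environment(params):
    """Toy stand-in for the robot/world: maps motor parameters
    to an observed outcome (hypothetical, for illustration only)."""
    return np.tanh(params[:2] + 0.1 * params[2:])

class GoalExplorationProcess:
    """Minimal population-based goal exploration loop."""

    def __init__(self, param_dim=4, outcome_dim=2):
        self.param_dim, self.outcome_dim = param_dim, outcome_dim
        self.params, self.outcomes = [], []  # population of past experiments

    def sample_goal(self):
        # Self-generated goal in outcome space (here: uniform in [-1, 1]^d).
        return rng.uniform(-1, 1, self.outcome_dim)

    def inverse_model(self, goal):
        # Reuse knowledge: take the parameters whose past outcome was
        # closest to the goal, plus a small exploratory perturbation.
        dists = [np.linalg.norm(o - goal) for o in self.outcomes]
        best = self.params[int(np.argmin(dists))]
        return best + rng.normal(0.0, 0.05, self.param_dim)

    def explore(self, n_iterations=200, n_bootstrap=10):
        for i in range(n_iterations):
            if i < n_bootstrap:  # bootstrap with random motor babbling
                theta = rng.uniform(-1, 1, self.param_dim)
            else:
                theta = self.inverse_model(self.sample_goal())
            outcome = environment(theta)
            self.params.append(theta)      # every experiment is stored,
            self.outcomes.append(outcome)  # whatever goal it was run for

GoalExplorationProcess().explore()
```

Note that every experiment is stored regardless of which goal motivated it: this is what lets an attempt at one goal later serve as a stepping stone towards another.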
One notable implementation discussed in the paper is Active Model Babbling (AMB), a particular form of IMGEP. The AMB architecture uses a population-based approach with both spatial and temporal modularity, ensuring that relevant behavioral features are preserved and explored efficiently. Spatial modularity assigns a specific goal space to each object in the environment, while temporal modularity structures the exploration sequence so that already discovered stepping stones are preserved and reused across tasks, as illustrated in the sketch below.
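As an illustration of the spatial modularity, the following sketch (a hypothetical structure, not the paper's implementation) gives each object its own goal space by projecting the full outcome vector onto per-object dimensions, so a goal constrains only the targeted object:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical modular outcome space: one goal space (module) per object,
# each observing only that object's slice of the full sensory outcome.
MODULES = {
    "hand":  slice(0, 2),  # 2-D hand position
    "stick": slice(2, 4),  # 2-D tool position
    "toy":   slice(4, 6),  # 2-D position of a toy reachable via the tool
}

def sample_modular_goal(module):
    """Sample a goal that constrains only the chosen object's dimensions."""
    dims = MODULES[module]
    goal = np.full(6, np.nan)  # NaN marks unconstrained dimensions
    goal[dims] = rng.uniform(-1, 1, dims.stop - dims.start)
    return goal

def goal_distance(goal, outcome):
    """Competence is measured only on the goal's constrained dimensions."""
    mask = ~np.isnan(goal)
    return np.linalg.norm(goal[mask] - outcome[mask])

goal = sample_modular_goal("stick")
print(goal, goal_distance(goal, np.zeros(6)))
```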
The robustness of AMB is evaluated across diverse experimental setups. The experiments demonstrate that AMB can autonomously discover a learning curriculum that enables it to sequentially master increasingly complex skills, such as nested tool use, in challenging environments. In particular, AMB outperforms several baselines, including random exploration and single-goal-space exploration strategies, suggesting that modular, population-based approaches significantly extend an agent's learning horizon when paired with effective goal selection mechanisms.
The paper also examines the role of learning-progress-based intrinsic rewards within IMGEP, showing the gains in exploration efficiency they produce. Agents steer their exploration towards areas of high learning progress, sidestepping both over-familiar regions and currently infeasible goals. These self-organizing processes yield diverse benefits, notably promoting the discovery of intricate task hierarchies and tool-use capabilities in both simulated and real-world robotics.
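This selection mechanism can be sketched as follows, assuming competence is tracked per goal space over a sliding window and goal spaces are sampled in proportion to the absolute change in competence (a simplified stand-in for the paper's interest measure; names are illustrative):

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(2)

class LearningProgressSelector:
    """Chooses which goal space to explore next, favoring goal spaces
    whose competence is changing fastest (simplified interest measure)."""

    def __init__(self, modules, window=20, eps=0.2):
        self.history = {m: deque(maxlen=2 * window) for m in modules}
        self.window, self.eps = window, eps

    def record(self, module, competence):
        # competence: e.g. negative distance between goal and outcome
        self.history[module].append(competence)

    def learning_progress(self, module):
        h = list(self.history[module])
        if len(h) < 2 * self.window:
            return 1.0  # optimistic start: unexplored spaces look promising
        old, recent = np.mean(h[:self.window]), np.mean(h[self.window:])
        return abs(recent - old)  # absolute change in competence

    def choose(self):
        modules = list(self.history)
        if rng.random() < self.eps:     # occasional random pick so that
            return rng.choice(modules)  # stalled spaces are revisited
        lp = np.array([self.learning_progress(m) for m in modules])
        if lp.sum() == 0:
            return rng.choice(modules)
        return rng.choice(modules, p=lp / lp.sum())

selector = LearningProgressSelector(["hand", "stick", "toy"])
print(selector.choose())
```

Sampling by the change in competence rather than competence itself is the key design choice: it lets the agent abandon goal spaces it has already mastered and postpone those where no progress is currently possible.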
The implications of this work are far-reaching, pointing to developmental AI frameworks in which autonomous, open-ended learning becomes scalable. In addition, the modular nature of the IMGEP model parallels challenges seen in complex AI tasks, such as continual and transfer learning, underscoring the value of hierarchical, adaptable strategies.
Future research may focus on integrating representation learning within the modular IMGEP framework, addressing real-world scenarios where perceptual abilities are scaffolded over time. Furthermore, scaling these architectures with deep learning techniques could bridge population-based and monolithic large-scale exploration approaches, enhancing both abstraction and perception in autonomous learning systems.
Ultimately, the proposed IMGEP framework marks meaningful progress towards equipping machines with self-driven mechanisms for exploration and learning, moving a step closer to the open-ended learning that characterizes human cognitive development.