CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning (1810.06284v4)

Published 15 Oct 2018 in cs.AI

Abstract: In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments, and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery on the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus back on goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate properties of robustness to distracting goals, forgetting and changes in body properties.

Authors (5)

Cédric Colas
Pierre Fournier
Olivier Sigaud
Mohamed Chetouani
Pierre-Yves Oudeyer

Citations (40)

Summary

Intrinsically Motivated Modular Multi-Goal Reinforcement Learning Using CURIOUS

The paper "CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning" addresses reinforcement learning (RL) in open-ended environments, where agents must autonomously learn a repertoire of skills. Its algorithm, CURIOUS, combines modular goal representations with an intrinsically motivated curriculum, so that a single policy can pursue and improve on goals of several different kinds.

Overview

CURIOUS introduces a modular variant of the Universal Value Function Approximator (UVFA), which lets a single RL policy target goals of several different kinds. Its central contribution is to pair this modular goal representation with an automated curriculum that biases the agent's attention toward the goals with the highest absolute learning progress. Because all goals share one policy, experience gathered on one goal can also benefit the others, which improves overall mastery of the learnable goals.
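To make the modular representation concrete, the sketch below shows one way the input of a modular UVFA could be assembled: the state, a one-hot module descriptor, and a goal vector with one slot per module that is zero everywhere except for the selected module. The function name, the zero-padded layout, and the dimensions are assumptions made for illustration; they are not taken from this summary.

```python
from typing import List, Sequence


def encode_policy_input(
    state: Sequence[float],
    module_idx: int,
    goal: Sequence[float],
    goal_dims: Sequence[int],
) -> List[float]:
    """Build a flat policy input: state + one-hot module + zero-padded goal slots.

    goal_dims[i] is the (hypothetical) goal dimension of module i.
    """
    if not 0 <= module_idx < len(goal_dims):
        raise ValueError("module_idx out of range")
    if len(goal) != goal_dims[module_idx]:
        raise ValueError("goal has the wrong dimension for this module")

    # One-hot descriptor telling the network which module is active.
    one_hot = [1.0 if i == module_idx else 0.0 for i in range(len(goal_dims))]

    # One slot per module; only the active module's slot carries the goal.
    goal_slots: List[float] = []
    for i, dim in enumerate(goal_dims):
        goal_slots.extend(list(goal) if i == module_idx else [0.0] * dim)

    return list(state) + one_hot + goal_slots


# Two modules with 3-D and 2-D goals; the second module is selected.
x = encode_policy_input([0.5, -0.5], 1, [0.2, 0.3], [3, 2])
```

With this layout the input size stays fixed regardless of which module is active, which is what allows a single network to serve all modules.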

Methodology

A core component of this work is the Modular Universal Value Function Approximator (M-UVFA), which combines goal parameterization with hindsight learning. This enables agents to tackle distinct kinds of goals, such as Reach, Push, Pick and Place, and Stack, all within a single architecture. In addition, the algorithm uses an active goal selection strategy rooted in Intrinsically Motivated Goal Exploration Processes (IMGEP): it prioritizes goals with the highest absolute learning progress, that is, goals on which competence is changing fastest in either direction.
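Hindsight learning can be illustrated with a small relabeling routine: a stored transition is replayed as if the agent had been pursuing a goal that it actually reached later in the same episode, and the reward is recomputed for that substituted goal. The sketch below is a minimal illustration under stated assumptions. The module-to-slice mapping `MODULE_SLICES`, the sparse 0/-1 reward, and the tolerance are hypothetical choices, not details confirmed by this summary.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical layout: each module reads its goal from a slice of the outcome vector.
MODULE_SLICES = {"reach": slice(0, 3), "push": slice(3, 5)}


def sparse_reward(achieved: Sequence[float], goal: Sequence[float], tol: float = 0.05) -> float:
    """Return 0.0 if the achieved outcome is within tol of the goal, else -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel(
    outcomes: Sequence[Sequence[float]],
    t: int,
    module: str,
    rng: random.Random,
) -> Tuple[List[float], float]:
    """Relabel the transition from step t to t+1 with a goal reached later on.

    outcomes[k] is the outcome vector observed at step k of one episode.
    The module may differ from the one the agent was originally pursuing.
    """
    if not 0 <= t < len(outcomes) - 1:
        raise ValueError("t must index a transition inside the episode")
    part = MODULE_SLICES[module]
    # Pick a step at or after the transition's end as the substitute goal.
    future = rng.randint(t + 1, len(outcomes) - 1)
    new_goal = list(outcomes[future][part])
    reward = sparse_reward(outcomes[t + 1][part], new_goal)
    return new_goal, reward


episode = [[0.0, 0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.3, 0.4, 0.5]]
goal, reward = relabel(episode, 0, "push", random.Random(0))
```

Because the substitute goal comes from the episode itself, even a failed rollout yields transitions with informative rewards for some goal.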

CURIOUS operates in a setting where distracting goals can waste practice time and where changes in the environment or in the agent's body can cause previously mastered goals to be forgotten. The algorithm's answer is a self-organizing developmental curriculum: the agent focuses on goals of increasing complexity and returns to goals whose competence is dropping. Experiments in a new modular-goal robotic environment show this behavior, along with robustness to distracting goals, forgetting, and changes in body properties.
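The role of absolute learning progress in such a curriculum can be sketched as follows. Competence per module is estimated from recent success/failure outcomes, learning progress (LP) is the change in competence between the older and the more recent half of that window, and modules are sampled in proportion to |LP| mixed with a uniform term. The window split, the mixing scheme, and the value of `epsilon` are assumptions made for illustration rather than details stated in this summary.

```python
from typing import List, Sequence


def learning_progress(results: Sequence[int]) -> float:
    """Difference in success rate between the recent and older half of a window.

    results holds 0/1 outcomes of recent attempts, oldest first.
    """
    half = len(results) // 2
    if half == 0:
        return 0.0
    older = results[:half]
    recent = results[-half:]
    return sum(recent) / half - sum(older) / half


def selection_probabilities(lps: Sequence[float], epsilon: float = 0.4) -> List[float]:
    """Mix uniform sampling with sampling proportional to absolute LP."""
    n = len(lps)
    total = sum(abs(lp) for lp in lps)
    if total == 0:
        # No measurable progress anywhere: fall back to uniform sampling.
        return [1.0 / n] * n
    return [epsilon / n + (1.0 - epsilon) * abs(lp) / total for lp in lps]


improving = learning_progress([0, 0, 1, 1])    # competence rising
forgetting = learning_progress([1, 1, 0, 0])   # competence falling
stalled = learning_progress([0, 0, 0, 0])      # e.g. a distracting or impossible goal
probs = selection_probabilities([improving, forgetting, stalled])
```

Taking the absolute value means a module that is being forgotten (negative LP) attracts as much attention as one that is improving, while a stalled module, such as a distractor, is sampled only through the uniform term.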

Results and Discussion

Experiments show clear benefits from the modular goal representation, particularly when the agent must learn several kinds of tasks at once. Compared with flat multi-goal RL approaches, agents using CURIOUS transfer skills better across goals and remain robust to distracting goals, forgetting, and changes in body properties. The curriculum also lets agents organize their practice into successive phases of focus without external supervision, a progression the authors describe as developmental.

Beyond these practical results, CURIOUS illustrates how flexibly UVFA-style architectures can be adapted to modular RL settings. The paper encourages further work on hierarchical modules and autonomous goal setting, with the aim of agents that can construct their own learning curriculum.

Implications and Future Directions

The paper is relevant to reinforcement learning for autonomous agents that must operate in dynamic and unpredictable environments. CURIOUS provides a foundation for RL systems with more autonomous behavior, envisioning agents that not only perform tasks but also direct their own learning trajectories.

From a practical standpoint, the CURIOUS framework suggests more efficient learning of complex tasks in which many subgoals overlap or interact. Attention mechanisms based on learning progress may also prove useful in robotics and in computational models of human learning.

Potential future developments could explore more sophisticated models of competence and learning progress estimations, as well as scaling CURIOUS to handle larger and more intricate task environments. Integrating CURIOUS with unsupervised learning methods to autonomously identify and construct modular goal spaces presents another exciting avenue for research, aimed at further reducing dependencies on manually curated goal representations.

In conclusion, the CURIOUS algorithm extends multi-goal RL by using intrinsic motivation and modularity to support autonomous learning, and it offers a useful reference point for research on autonomous skill acquisition.
