A Deep Hierarchical Approach to Lifelong Learning in Minecraft
This paper presents a novel approach to lifelong learning in high-dimensional environments, using Minecraft, a video game known for its complexity and diversity of tasks, as the test domain. The proposed solution is a hierarchical deep reinforcement learning architecture termed the Hierarchical Deep Reinforcement Learning Network (H-DRLN).
Overview of H-DRLN
At the core of H-DRLN is the concept of leveraging hierarchical structures to facilitate skill reuse and acquisition, which are pivotal to lifelong learning. This is realized through two components:
- Deep Skill Networks (DSNs): Pre-trained networks that each encapsulate a reusable skill, i.e., a policy learned for a distinct Minecraft sub-task (e.g., navigation, item pickup). These networks form the basis for skill reuse in lifelong learning.
- Skill Distillation: A technique that consolidates multiple skills into a single distilled network, reducing the overhead of maintaining a separate network per task. It extends traditional policy distillation from whole policies to individual skills.
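The two components above combine in the H-DRLN controller, whose action set contains both primitive Minecraft actions and DSN skills; a selected skill runs its own policy until it terminates, making the controller's decisions temporally extended. The sketch below illustrates this idea under assumed interfaces: the `DeepSkillNetwork` class, `hdrln_step`, and the `env.step(action) -> (state, reward)` contract are all hypothetical simplifications, not the paper's actual implementation.

```python
import random

class DeepSkillNetwork:
    """A pre-trained skill: a greedy policy plus a termination predicate.

    Hypothetical interface; in the paper, DSNs are pre-trained DQNs.
    """
    def __init__(self, policy, is_done):
        self.policy = policy      # state -> primitive action
        self.is_done = is_done    # state -> bool (skill termination)

def hdrln_step(state, q_values, primitives, skills, env, epsilon=0.1):
    """One controller decision over the joint action set primitives + skills.

    q_values: controller Q-values, one per primitive followed by one per
    skill. A chosen skill executes its own policy until termination, so
    the controller operates over temporally extended actions.
    Returns (next_state, accumulated_reward, steps_elapsed).
    """
    n = len(primitives) + len(skills)
    if random.random() < epsilon:                 # epsilon-greedy exploration
        choice = random.randrange(n)
    else:
        choice = max(range(n), key=lambda i: q_values[i])

    if choice < len(primitives):                  # primitive: a single env step
        next_state, reward = env.step(primitives[choice])
        return next_state, reward, 1

    skill = skills[choice - len(primitives)]      # skill: run DSN to termination
    total_reward, steps = 0.0, 0
    while not skill.is_done(state):
        state, reward = env.step(skill.policy(state))
        total_reward += reward
        steps += 1
    return state, total_reward, steps
```

Treating a skill as one controller action is what lets H-DRLN reuse pre-trained behavior: credit is assigned to the whole skill execution rather than to each primitive step.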
Technical Contributions
The paper advances the field by:
- Introducing a modular deep reinforcement learning infrastructure that efficiently integrates pre-trained skills into a hierarchical learning framework, enhancing the scalability of lifelong learning systems.
- Demonstrating skill distillation as a method to consolidate and scale the learning of reusable skills within the H-DRLN framework. This consolidation significantly reduces sample complexity while preserving strong task performance across various sub-domains of Minecraft.
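The distillation contribution above builds on policy distillation: a student network is trained to match the temperature-softened output distributions of multiple teacher DSNs. A minimal sketch of such a loss follows; the function names, the temperature value, and the `multi_skill_loss` batching scheme are illustrative assumptions, not the paper's exact training procedure.

```python
import math

def softmax(qs, tau=1.0):
    """Temperature-softened softmax over Q-values (lower tau = sharper)."""
    m = max(qs)  # subtract max for numerical stability
    exps = [math.exp((q - m) / tau) for q in qs]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(teacher_q, student_q, tau=0.1):
    """KL(teacher || student) between softened Q-value distributions.

    A sharp temperature keeps the teacher's greedy action dominant while
    still conveying the relative ordering of the other actions.
    """
    p = softmax(teacher_q, tau)
    q = softmax(student_q, tau)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def multi_skill_loss(batch, teachers, student, tau=0.1):
    """Average distillation loss over a batch of (state, skill_id) pairs.

    Each teacher DSN labels states from its own sub-domain; the student
    is assumed to expose one output head per skill (hypothetical setup).
    """
    total = 0.0
    for state, skill_id in batch:
        total += distillation_loss(teachers[skill_id](state),
                                   student(state, skill_id), tau)
    return total / len(batch)
```

The loss is zero when the student reproduces each teacher's softened distribution exactly, which is why a single distilled network can stand in for many task-specific DSNs.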
Empirical Results
The paper includes rigorous empirical evaluations showing that H-DRLN significantly outperforms the standard Deep Q-Network (DQN) across various sub-domains of Minecraft. Key observations include:
- Performance and Convergence: H-DRLN achieved better final performance and faster convergence, owing to its ability to reuse previously acquired skills effectively.
- Task Transferability: The system demonstrated competency in learning generalized solutions that could handle new and related tasks with minimal additional training. The skill transfer ability was shown to yield higher reward rates compared to non-hierarchical approaches.
Implications and Future Work
The research implications suggest that hierarchical strategies combined with efficient skill transfer mechanisms hold the key to advancing AI systems toward true lifelong learning capabilities. The integration of deep hierarchical structures can potentially be applied to other complex environments and tasks beyond Minecraft, including real-world applications requiring a blend of adaptability and scalability.
Looking forward, future developments may involve:
- Online skill learning and refinement, which could further enhance the adaptability of the H-DRLN system in dynamic environments.
- Extending the framework to handle real-world scenarios by integrating real-time skill acquisition and online policy adaptation.
Conclusion
The paper effectively merges hierarchical deep reinforcement learning with the challenges of lifelong learning, using Minecraft as an apt sandbox for exploration and validation. By demonstrating the viability of the H-DRLN to learn and transfer knowledge across tasks with varying complexities, it paves the way for future AI systems that can seamlessly learn and adapt across lifetimes of interaction.