Reset-Free Lifelong Learning with Skill-Space Planning (2012.03548v3)

Published 7 Dec 2020 in cs.LG, cs.AI, and cs.RO

Abstract: The objective of lifelong reinforcement learning (RL) is to optimize agents which can continuously adapt and interact in changing environments. However, current RL approaches fail drastically when environments are non-stationary and interactions are non-episodic. We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills. We learn the skills in an unsupervised manner using intrinsic rewards and plan over the learned skills using a learned dynamics model. Moreover, our framework permits skill discovery even from offline data, thereby reducing the need for excessive real-world interactions. We demonstrate empirically that LiSP successfully enables long-horizon planning and learns agents that can avoid catastrophic failures even in challenging non-stationary and non-episodic environments derived from gridworld and MuJoCo benchmarks.
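To make the idea of planning in skill space concrete, here is a minimal toy sketch. It is not the authors' implementation, and the abstract does not specify the skill-discovery method, model class, or planner. Everything below is an assumption for illustration: a 1-D state, three hand-written "skills" (`SKILL_EFFECT`), a fixed rule standing in for the learned dynamics model (`predict_next_state`), a distance-based `reward`, and exhaustive search over short skill sequences in place of whatever planner the paper uses. The agent replans at every step and executes only the first skill of the best sequence, and the loop never resets the environment.

```python
from itertools import product

# Hypothetical skill set: each skill index maps to a net displacement on a 1-D line.
# In the paper the skills are learned without supervision; here they are hand-written.
SKILL_EFFECT = (-1.0, 0.0, 1.0)


def predict_next_state(state, skill):
    """Stand-in for a learned skill-dynamics model (a fixed toy rule, not fit to data)."""
    return state + SKILL_EFFECT[skill]


def reward(state, goal):
    """Assumed task reward: the closer to the goal, the higher the reward."""
    return -abs(state - goal)


def plan_skill(state, goal, horizon=5):
    """Score every skill sequence of length `horizon` under the model and
    return the first skill of the highest-return sequence."""
    best_return, best_first = float("-inf"), None
    for seq in product(range(len(SKILL_EFFECT)), repeat=horizon):
        s, total = state, 0.0
        for z in seq:
            s = predict_next_state(s, z)  # imagined rollout, no real interaction
            total += reward(s, goal)
        if total > best_return:
            best_return, best_first = total, seq[0]
    return best_first


def run(start=0.0, goal=4.0, steps=10):
    """Non-episodic control loop: replan at every step, never reset the state."""
    state = start
    for _ in range(steps):
        # In this toy, the model is also used as the environment.
        state = predict_next_state(state, plan_skill(state, goal))
    return state
```

Calling `run()` moves the toy agent from the start position toward the goal and holds it there. Exhaustive search is only feasible because the skill set and horizon are tiny; a sampling-based planner would be the natural replacement at larger scale.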

Citations (36)

