2000 character limit reached
Multi-task Reinforcement Learning with a Planning Quasi-Metric (2002.03240v3)
Published 8 Feb 2020 in cs.LG and stat.ML
Abstract: We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learned in a dense and unsupervised manner. We achieve multiple-fold training speed-up compared to recently published methods on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.