Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning (2102.12962v3)

Published 25 Feb 2021 in cs.LG, cs.AI, and cs.RO

Abstract: Multi-goal reinforcement learning is widely applied in planning and robot manipulation. Two main challenges in multi-goal reinforcement learning are sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) aims to tackle both challenges via goal relabeling. However, HER-related works still require millions of samples and substantial computation. In this paper, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeled returns based on $n$-step relabeling to improve sample efficiency. Despite the advantages of $n$-step relabeling, we prove theoretically and experimentally that the off-policy $n$-step bias it introduces may lead to poor performance in many environments. To address this issue, two bias-reduced MHER algorithms, MHER($\lambda$) and Model-based MHER (MMHER), are presented. MHER($\lambda$) exploits the $\lambda$-return, while MMHER benefits from model-based value expansions. Experimental results on numerous multi-goal robotic tasks show that our solutions successfully alleviate off-policy $n$-step bias and achieve significantly higher sample efficiency than HER and Curriculum-guided HER, with little additional computation beyond HER.
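The abstract's two ingredients, $n$-step goal relabeling and a $\lambda$-weighted mixture of multi-step returns, can be sketched in a few lines. The sketch below is a minimal illustration under standard definitions, not the paper's implementation: `compute_reward`, `q_value`, and `policy` are hypothetical placeholders for a goal-conditioned reward function and an actor-critic agent, and MHER($\lambda$)'s exact weighting scheme may differ from the common truncated $\lambda$-return used here.

```python
import numpy as np

# Hypothetical helpers (not from the paper's code): compute_reward(next_state, goal),
# q_value(state, action, goal), and policy(state, goal) stand in for the environment's
# goal-conditioned reward and an actor-critic agent.

def n_step_relabeled_return(transitions, t, n, new_goal, gamma,
                            compute_reward, q_value, policy):
    """n-step target for transitions[t], with every reward recomputed under the
    relabeled goal `new_goal` and a bootstrap from the critic n steps ahead."""
    horizon = min(n, len(transitions) - t)
    ret = 0.0
    for i in range(horizon):
        _, _, next_state = transitions[t + i]          # (state, action, next_state)
        ret += (gamma ** i) * compute_reward(next_state, new_goal)
    boot_state = transitions[t + horizon - 1][2]       # state reached after `horizon` steps
    ret += (gamma ** horizon) * q_value(boot_state,
                                        policy(boot_state, new_goal), new_goal)
    return ret


def truncated_lambda_return(transitions, t, n_max, new_goal, gamma, lam,
                            compute_reward, q_value, policy):
    """Weighted mixture of 1..n_max step relabeled returns; longer returns get
    geometrically smaller weight, which damps the off-policy n-step bias."""
    returns = [n_step_relabeled_return(transitions, t, n, new_goal, gamma,
                                       compute_reward, q_value, policy)
               for n in range(1, n_max + 1)]
    weights = [(1.0 - lam) * lam ** (n - 1) for n in range(1, n_max)]
    weights.append(lam ** (n_max - 1))                 # leftover mass on the longest return
    return float(np.dot(weights, returns))
```

With $\lambda = 0$ this reduces to the one-step HER target, and with $\lambda \to 1$ it approaches the plain $n$-step relabeled return, which is where the off-policy bias discussed in the paper is largest.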

Authors (8)
  1. Rui Yang (221 papers)
  2. Jiafei Lyu (27 papers)
  3. Yu Yang (213 papers)
  4. Jiangpeng Yan (23 papers)
  5. Feng Luo (91 papers)
  6. Dijun Luo (10 papers)
  7. Lanqing Li (21 papers)
  8. Xiu Li (166 papers)
Citations (6)
