Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning (1909.03245v3)

Published 7 Sep 2019 in cs.LG, cs.AI, and stat.ML

Abstract: Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle these problems, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), an effective approach to accelerating the solution of fixed-point problems with perturbations. Specifically, we first explain how policy iteration can be applied directly with Anderson acceleration. Then we extend RAA to the case of deep RL by introducing a regularization term to control the impact of the perturbation induced by function approximation errors. We further propose two strategies, i.e., progressive update and adaptive restart, to enhance the performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and final performance of state-of-the-art deep RL algorithms.
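
The core mechanism the abstract describes is to form each new iterate from a regularized combination of several previous fixed-point updates rather than from the latest update alone. No code is provided on this page, so the snippet below is only a minimal, self-contained sketch of one regularized Anderson step applied to a toy contraction mapping (not to a deep RL value function); the function name `regularized_anderson_step`, the memory window of 5, and the regularization constant `lam` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def regularized_anderson_step(iterates, residuals, lam=1e-3):
    """One regularized Anderson acceleration step (illustrative sketch).

    iterates:  previous fixed-point map outputs g(x_0), ..., g(x_m)
    residuals: residuals r_i = g(x_i) - x_i (same length as iterates)
    lam:       regularization strength damping sensitivity to perturbations

    Solves  min_alpha ||R alpha||^2 + lam ||alpha||^2   s.t.  sum(alpha) = 1
    and returns the combined iterate  sum_i alpha_i * g(x_i).
    """
    R = np.stack(residuals, axis=1)          # residual matrix, shape (dim, m+1)
    m = R.shape[1]
    A = R.T @ R + lam * np.eye(m)            # regularized normal-equation matrix
    ones = np.ones(m)
    # KKT system for the equality-constrained, regularized least-squares problem:
    # [A  1; 1^T  0] [alpha; mu] = [0; 1]
    kkt = np.block([[A, ones[:, None]], [ones[None, :], np.zeros((1, 1))]])
    rhs = np.concatenate([np.zeros(m), [1.0]])
    alpha = np.linalg.solve(kkt, rhs)[:m]
    G = np.stack(iterates, axis=1)           # stacked map outputs, shape (dim, m+1)
    return G @ alpha

# Toy usage: accelerate the contraction g(x) = 0.5 * x + b toward its fixed point.
b = np.array([1.0, -2.0, 3.0])
g = lambda x: 0.5 * x + b
x = np.zeros(3)
hist_g, hist_r = [], []
for _ in range(20):
    gx = g(x)
    hist_g.append(gx)
    hist_r.append(gx - x)
    hist_g, hist_r = hist_g[-5:], hist_r[-5:]   # keep a short memory window
    x = regularized_anderson_step(hist_g, hist_r)
```

On this toy problem the combined iterate reaches the fixed point in fewer sweeps than plain iteration of g; the abstract's contribution is to carry this idea over to off-policy deep RL, where the regularization term controls the perturbation introduced by function approximation errors, together with the progressive update and adaptive restart strategies.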

Authors (6)
  1. Wenjie Shi (6 papers)
  2. Shiji Song (103 papers)
  3. Hui Wu (54 papers)
  4. Ya-Chu Hsu (1 paper)
  5. Cheng Wu (31 papers)
  6. Gao Huang (178 papers)
Citations (25)
