
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance (2111.09159v3)

Published 17 Nov 2021 in cs.LG and cs.AI

Abstract: Recent advances in model-free deep reinforcement learning (DRL) show that simple model-free methods can be highly effective in challenging high-dimensional continuous control tasks. In particular, Truncated Quantile Critics (TQC) achieves state-of-the-art asymptotic training performance on the MuJoCo benchmark with a distributional representation of critics; and Randomized Ensemble Double Q-Learning (REDQ) achieves high sample efficiency that is competitive with state-of-the-art model-based methods using a high update-to-data ratio and target randomization. In this paper, we propose a novel model-free algorithm, Aggressive Q-Learning with Ensembles (AQE), which improves the sample-efficiency performance of REDQ and the asymptotic performance of TQC, thereby providing overall state-of-the-art performance during all stages of training. Moreover, AQE is very simple, requiring neither distributional representation of critics nor target randomization. The effectiveness of AQE is further supported by our extensive experiments, ablations, and theoretical results.
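The abstract describes the mechanism only at a high level: an ensemble of standard critics whose pessimism is controlled by how many of the lowest ensemble values are kept when forming the Bellman target, with no quantile critics (TQC) and no randomized target subsets (REDQ). The sketch below illustrates that kind of kept-lowest-K ensemble target under stated assumptions; it is not the paper's reference implementation. The names `QNetwork`, `aqe_target`, `N_CRITICS`, and `KEEP_K` are hypothetical, and the SAC-style entropy term is an assumption.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (names and values are assumptions, not from the paper's code):
N_CRITICS = 10   # ensemble size N
KEEP_K = 2       # number of smallest Q-values averaged into the target
GAMMA = 0.99

class QNetwork(nn.Module):
    """A plain MLP critic Q(s, a); the architecture here is an assumption."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def aqe_target(target_critics, next_obs, next_act, reward, done,
               next_log_prob, alpha, keep_k=KEEP_K, gamma=GAMMA):
    """Form a Bellman target from the mean of the keep_k smallest Q-values
    across the whole target ensemble -- no distributional critics and no
    random subset selection. Assumes a SAC-style entropy bonus."""
    with torch.no_grad():
        # Evaluate every target critic: shape (batch, N_CRITICS).
        q_all = torch.cat([q(next_obs, next_act) for q in target_critics], dim=-1)
        # Keep the keep_k smallest values per transition: shape (batch, keep_k).
        q_keep, _ = torch.topk(q_all, keep_k, dim=-1, largest=False)
        # Average the kept values; subtract the entropy term, then bootstrap.
        q_next = q_keep.mean(dim=-1, keepdim=True) - alpha * next_log_prob
        return reward + gamma * (1.0 - done) * q_next
```

Varying `keep_k` trades off pessimism against optimism: a small `keep_k` relative to the ensemble size gives a conservative target that curbs overestimation early in training, while a larger `keep_k` approaches the plain ensemble mean for stronger asymptotic performance.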

Authors (5)
  1. Yanqiu Wu (12 papers)
  2. Xinyue Chen (28 papers)
  3. Che Wang (15 papers)
  4. Yiming Zhang (128 papers)
  5. Keith W. Ross (7 papers)
Citations (8)
