Gradient Q$(\sigma, \lambda)$: A Unified Algorithm with Function Approximation for Reinforcement Learning (1909.02877v1)

Published 6 Sep 2019 in cs.LG and stat.ML

Abstract: Full-sampling (e.g., Q-learning) and pure-expectation (e.g., Expected Sarsa) algorithms are efficient and frequently used techniques in reinforcement learning. Q$(\sigma,\lambda)$ is the first approach that unifies them, together with eligibility traces, through the sampling degree $\sigma$. However, it is limited to the tabular case: for large-scale learning, Q$(\sigma,\lambda)$ is prohibitively expensive, requiring a huge volume of tables to store value functions accurately. To address this problem, we propose GQ$(\sigma,\lambda)$, which extends tabular Q$(\sigma,\lambda)$ with linear function approximation. We prove the convergence of GQ$(\sigma,\lambda)$. Empirical results on several standard domains show that GQ$(\sigma,\lambda)$, combining full sampling with pure expectation, achieves better performance than either full-sampling or pure-expectation methods alone.
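To make the abstract's central idea concrete, the sketch below shows a single update that mixes the full-sampling target ($\sigma = 1$, Q-learning-style) and the pure-expectation target ($\sigma = 0$, Expected-Sarsa-style) under a linear approximation $Q(s,a) \approx w^\top \phi(s,a)$ with an accumulating eligibility trace decayed by $\gamma\lambda$. This is a minimal semi-gradient sketch of the $\sigma$-mixed backup, not the paper's exact GQ$(\sigma,\lambda)$ update (whose convergence proof may rely on additional correction terms); the function and variable names (`gq_sigma_lambda_step`, `phi_next`, `pi_next`) are illustrative assumptions.

```python
import numpy as np

def gq_sigma_lambda_step(w, e, phi_sa, phi_next, pi_next, a_next, r,
                         sigma=0.5, gamma=0.99, lam=0.9, alpha=0.1):
    """One semi-gradient step of a sigma-mixed backup with linear
    function approximation and an accumulating eligibility trace.

    w        : weight vector; Q(s, a) is approximated by w @ phi(s, a)
    e        : eligibility trace, same shape as w
    phi_sa   : feature vector of the current state-action pair (s, a)
    phi_next : feature matrix, one row per next action a'
    pi_next  : target-policy probabilities over the next actions
    a_next   : index of the actually sampled next action a'
    r        : observed reward
    """
    q_vals = phi_next @ w                    # Q(s', .) for every next action
    sampled = q_vals[a_next]                 # full-sampling term (sigma = 1)
    expected = pi_next @ q_vals              # pure-expectation term (sigma = 0)
    target = r + gamma * (sigma * sampled + (1.0 - sigma) * expected)
    delta = target - w @ phi_sa              # TD error of the mixed backup
    e = gamma * lam * e + phi_sa             # accumulating trace
    w = w + alpha * delta * e                # semi-gradient weight update
    return w, e
```

Setting `sigma=1.0` recovers a sampled (Sarsa-like) backup and `sigma=0.0` an expected backup, so intermediate values interpolate between the two families, which is the unification the abstract describes.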

Citations (1)
