On Linear Convergence of Policy Gradient Methods for Finite MDPs (2007.11120v2)
Published 21 Jul 2020 in cs.LG, math.OC, and stat.ML
Abstract: We revisit the finite-time analysis of policy gradient methods in one of the simplest settings: finite state and action MDPs with a policy class consisting of all stochastic policies and with exact gradient evaluations. Recent work has viewed this setting as an instance of smooth non-linear optimization and shown sub-linear convergence rates with small step-sizes. Here, we take a different perspective based on connections with policy iteration and show that many variants of policy gradient methods succeed with large step-sizes and attain a linear rate of convergence.
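As a rough illustration of the setting the abstract describes (exact gradient evaluations on a finite MDP), the following is a minimal NumPy sketch of policy gradient with exact gradients on a hypothetical toy MDP. It uses a tabular softmax parameterization, which is one common variant; the paper's policy class is the set of all stochastic policies, and the toy MDP, step size, and iteration count here are illustrative assumptions, not the paper's algorithm or constants.

```python
import numpy as np

# Hypothetical toy MDP (2 states, 2 actions); illustrative only, not from the paper.
np.random.seed(0)
S, A, gamma = 2, 2, 0.9
P = np.random.dirichlet(np.ones(S), size=(S, A))   # P[s, a, s']: transition kernel
r = np.random.rand(S, A)                            # r[s, a]: reward
rho = np.ones(S) / S                                # start-state distribution

def policy_eval(pi):
    """Exact V^pi, Q^pi, and discounted state occupancy d^pi under start dist rho."""
    P_pi = np.einsum('sa,sab->sb', pi, P)           # induced state-to-state kernel
    r_pi = np.einsum('sa,sa->s', pi, r)
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    Q = r + gamma * np.einsum('sab,b->sa', P, V)
    d = (1 - gamma) * np.linalg.solve((np.eye(S) - gamma * P_pi).T, rho)
    return V, Q, d

# Optimal value via value iteration, used only to measure the optimality gap.
V_star = np.zeros(S)
for _ in range(10_000):
    V_star = np.max(r + gamma * np.einsum('sab,b->sa', P, V_star), axis=1)

# Softmax policy gradient with exact gradients and a (heuristically) large step size.
theta = np.zeros((S, A))
eta = 10.0                                           # illustrative large step size
for t in range(50):
    pi = np.exp(theta - theta.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)
    V, Q, d = policy_eval(pi)
    adv = Q - V[:, None]
    # Policy gradient theorem, softmax parameterization:
    # dV(rho)/dtheta[s,a] = (1 / (1 - gamma)) * d(s) * pi(s, a) * A(s, a)
    grad = d[:, None] * pi * adv / (1 - gamma)
    theta += eta * grad
    print(f"iter {t:2d}  optimality gap {rho @ (V_star - V):.3e}")
```

In this sketch the gradient is computed in closed form from the transition matrix rather than estimated from samples, matching the "exact gradient evaluations" assumption in the abstract; printing the optimality gap per iteration is simply a way to observe how quickly the gap shrinks under a given step size.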