On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost (1907.06246v1)

Published 14 Jul 2019 in cs.LG, math.OC, and stat.ML

Abstract: Despite the empirical success of the actor-critic algorithm, its theoretical understanding lags behind. In a broader context, actor-critic can be viewed as an online alternating update algorithm for bilevel optimization, whose convergence is known to be fragile. To understand the instability of actor-critic, we focus on its application to linear quadratic regulators, a simple yet fundamental setting of reinforcement learning. We establish a nonasymptotic convergence analysis of actor-critic in this setting. In particular, we prove that actor-critic finds a globally optimal pair of actor (policy) and critic (action-value function) at a linear rate of convergence. Our analysis may serve as a preliminary step towards a complete theoretical understanding of bilevel optimization with nonconvex subproblems, which is NP-hard in the worst case and is often solved using heuristics.
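
To make the alternating actor/critic structure concrete, here is a minimal model-based sketch of such updates for a discrete-time LQR. This is not the paper's algorithm (which is an online, two-timescale stochastic scheme with an ergodic-cost critic): the critic step below simply evaluates the current linear policy u = -Kx by iterating a Lyapunov fixed point, and the actor step is a damped Gauss-Newton-style improvement of K. The matrices A, B, Q, R, the initial gain, and the step sizes are illustrative assumptions.

```python
import numpy as np

# Sketch only: model-based alternating "critic" (policy evaluation) and
# "actor" (policy improvement) updates for LQR, not the paper's algorithm.
# Dynamics: x_{t+1} = A x_t + B u_t; cost: x'Qx + u'Ru; policy: u = -K x.

def critic_step(A, B, Q, R, K, iters=500):
    """Critic: evaluate policy K by iterating the Lyapunov fixed point
    P = Q + K'RK + (A - BK)' P (A - BK) (valid when A - BK is stable)."""
    Acl = A - B @ K
    P = np.zeros_like(Q)
    for _ in range(iters):
        P = Q + K.T @ R @ K + Acl.T @ P @ Acl
    return P

def actor_step(A, B, R, K, P, lr=0.5):
    """Actor: damped Gauss-Newton-style step toward the improved gain
    (R + B'PB)^{-1} B'PA implied by the critic's value matrix P."""
    G = R + B.T @ P @ B
    E = G @ K - B.T @ P @ A          # policy-gradient-type direction
    return K - lr * np.linalg.solve(G, E)

# Toy problem (hypothetical numbers; initial gain chosen to be stabilizing).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.eye(1)
K = np.array([[1.0, 1.0]])

for _ in range(200):
    P = critic_step(A, B, Q, R, K)   # critic: evaluate current policy
    K = actor_step(A, B, R, K, P)    # actor: improve policy using the critic
print("learned gain K:\n", K)
```

With the full Gauss-Newton step (lr = 1) the actor update reduces to exact policy iteration for LQR; the paper's contribution is showing that a genuinely online, gradient-based actor coupled with a simultaneously learned critic still converges to the globally optimal pair at a linear rate.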

Authors (4)
  1. Zhuoran Yang (155 papers)
  2. Yongxin Chen (146 papers)
  3. Mingyi Hong (172 papers)
  4. Zhaoran Wang (164 papers)
Citations (38)