Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies (2210.04810v1)

Published 10 Oct 2022 in math.OC, cs.LG, and stat.ML

Abstract: Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis, popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently-developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems such as the linear quadratic regulator (LQR), $\mathcal{H}_\infty$ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control.
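
As a concrete illustration of the gradient-based approach surveyed here, the sketch below runs exact policy gradient descent on a small discrete-time LQR instance, using the standard expressions $J(K) = \mathrm{tr}(P_K \Sigma_0)$ and $\nabla J(K) = 2\left[(R + B^\top P_K B)K - B^\top P_K A\right]\Sigma_K$, where $P_K$ and $\Sigma_K$ solve the closed-loop Lyapunov equations. The problem data, step size, and iteration count are illustrative assumptions, not values from the paper.

```python
# A minimal sketch of exact policy gradient descent for the discrete-time LQR
# (illustrative data and hyperparameters; not taken from the surveyed paper).
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Toy problem: x_{t+1} = A x_t + B u_t, cost = E[ sum_t x_t'Q x_t + u_t'R u_t ].
# A is Schur stable, so the initial gain K = 0 is stabilizing.
A = np.array([[0.9, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.eye(1)
Sigma0 = np.eye(2)  # covariance of the random initial state

def lqr_cost_and_grad(K):
    """Cost J(K) = tr(P_K Sigma0) and its exact gradient for the policy u = -K x."""
    A_cl = A - B @ K
    # Value matrix: P = (Q + K'RK) + A_cl' P A_cl
    P = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)
    # State correlation: Sigma = Sigma0 + A_cl Sigma A_cl'
    Sigma = solve_discrete_lyapunov(A_cl, Sigma0)
    J = np.trace(P @ Sigma0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return J, grad

K = np.zeros((1, 2))   # K = 0 is stabilizing because A is Schur stable
eta = 0.02             # step size (assumed small enough to preserve stability)
for _ in range(500):
    _, grad = lqr_cost_and_grad(K)
    K = K - eta * grad

print("final cost J(K):", lqr_cost_and_grad(K)[0])
print("final gain K:", K)
```

Although $J(K)$ is nonconvex over the set of stabilizing gains, it satisfies a gradient dominance property, so iterates of this form converge to the globally optimal gain; this is one of the landscape and global convergence results the survey reviews.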

Authors (6)
  1. Bin Hu (217 papers)
  2. Kaiqing Zhang (70 papers)
  3. Na Li (227 papers)
  4. Mehran Mesbahi (68 papers)
  5. Maryam Fazel (67 papers)
  6. Tamer Başar (200 papers)
Citations (27)
