
Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters (2303.16548v2)

Published 29 Mar 2023 in math.OC and cs.LG

Abstract: This paper studies an infinite-horizon optimal control problem for a discrete-time linear system with quadratic criteria, where both the system and the criteria have random parameters that are independent and identically distributed with respect to time. In this general setting, we apply the policy gradient method, a reinforcement learning technique, to search for the optimal control without requiring knowledge of the statistical information of the parameters. We investigate the sub-Gaussianity of the state process and establish a global linear convergence guarantee for this approach, based on assumptions that are weaker and easier to verify than those in existing results. Numerical experiments are presented to illustrate our result.
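
To make the setting concrete, here is a minimal sketch of a model-free policy gradient loop on a scalar system with i.i.d. random parameters. It is not the paper's algorithm: the two-point finite-difference gradient estimate, the Gaussian parameter distributions, the fixed cost weights, the truncated horizon, and every name and constant (`rollout_cost`, `policy_gradient`, `step`, `delta`, and so on) are assumptions chosen for illustration. The learner only observes simulated costs and never uses the parameter statistics directly.

```python
import numpy as np


def rollout_cost(k, rng, n_paths=2000, horizon=40, q=1.0, r=1.0):
    """Monte Carlo estimate of the truncated cost under the policy u_t = -k * x_t.

    Assumed dynamics: x_{t+1} = a_t * x_t + b_t * u_t, with a_t and b_t drawn
    i.i.d. from N(1, 0.1^2) at every step (hypothetical distributions).
    """
    x = np.ones(n_paths)          # every path starts at x_0 = 1
    total = np.zeros(n_paths)     # accumulated cost per path
    for _ in range(horizon):
        a = rng.normal(1.0, 0.1, n_paths)
        b = rng.normal(1.0, 0.1, n_paths)
        u = -k * x
        total += q * x**2 + r * u**2
        x = a * x + b * u
    return total.mean()


def policy_gradient(k0=0.2, step=0.02, n_iters=80, delta=0.05, seed=0):
    """Gradient descent on the gain k using two-point cost differences."""
    k = k0
    for i in range(n_iters):
        # Common random numbers: both perturbed rollouts reuse the same seed,
        # which reduces the variance of the finite-difference estimate.
        c_plus = rollout_cost(k + delta, np.random.default_rng(seed + i))
        c_minus = rollout_cost(k - delta, np.random.default_rng(seed + i))
        grad_estimate = (c_plus - c_minus) / (2.0 * delta)
        k -= step * grad_estimate
    return k


if __name__ == "__main__":
    k_final = policy_gradient()
    rng = np.random.default_rng(12345)
    print("initial cost estimate:", rollout_cost(0.2, rng))
    print("final gain:", k_final)
    print("final cost estimate:", rollout_cost(k_final, rng))
```

The initial gain must already be stabilizing in the mean-square sense, otherwise the simulated cost diverges and the difference estimate is meaningless. The step size and perturbation width here were picked by hand for this toy system and would need retuning for any other one.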

