Learning-based primal-dual optimal control of discrete-time stochastic systems with multiplicative noise (2506.02613v1)
Abstract: Reinforcement learning (RL) is an effective approach for solving optimal control problems without exact knowledge of the system model. However, the classical Q-learning method, a model-free RL algorithm, has limitations, such as the lack of rigorous theoretical analysis and the need to inject artificial disturbances during implementation. This paper studies the partially model-free stochastic linear quadratic regulator (SLQR) problem for a system with multiplicative noise from the primal-dual perspective to address these challenges. This approach lays a solid theoretical foundation for understanding the intrinsic mechanisms of classical RL algorithms. We reformulate the SLQR problem as a non-convex primal-dual optimization problem and establish a strong duality result, which enables us to derive model-based and model-free algorithms for SLQR optimal policy design based on the Karush-Kuhn-Tucker (KKT) conditions. An illustrative example of human arm movement demonstrates the validity of the proposed model-free algorithm and showcases the learning mechanism of the central nervous system.
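
For orientation, the following is a minimal sketch of a standard discrete-time SLQR formulation with multiplicative noise of the kind the abstract refers to; the matrices A, B, C, D, the weights Q, R, and the noise processes w_k, v_k are illustrative placeholders and are not taken from the paper.

% Illustrative SLQR setup with multiplicative noise (notation assumed, not from the paper)
\begin{align*}
  & x_{k+1} = (A + w_k C)\,x_k + (B + v_k D)\,u_k,
    \qquad \mathbb{E}[w_k] = \mathbb{E}[v_k] = 0, \\
  & J(u) = \mathbb{E}\!\left[\sum_{k=0}^{\infty}
      \bigl(x_k^{\top} Q x_k + u_k^{\top} R u_k\bigr)\right],
    \qquad Q \succeq 0,\ R \succ 0, \\
  & \text{minimize } J(u) \text{ over linear feedback policies } u_k = -K x_k
    \text{ subject to mean-square stability of the closed loop.}
\end{align*}

In a primal-dual treatment of this kind, the stability and cost constraints are typically encoded through Lagrange multiplier matrices, and the KKT conditions of the resulting (generally non-convex) program characterize the optimal gain K; the paper's contribution is to establish strong duality for this reformulation and to build model-based and model-free algorithms on top of it.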