Ergodic-risk Criterion for Stochastically Stabilizing Policy Optimization
Abstract: This paper introduces ergodic-risk criteria, which capture long-term cumulative risks associated with controlled Markov chains through probabilistic limit theorems, in contrast to existing methods that require assumptions of either finite hitting time, finite state/action space, or exponentiation necessitating light-tailed distributions. Using tailored Functional Central Limit Theorems (FCLT), we demonstrate that the time-correlated terms in the ergodic-risk criteria converge under uniform ergodicity and establish conditions for the convergence of these criteria in non-stationary general-state Markov chains involving heavy-tailed distributions. For quadratic risk functionals on stochastic linear systems, in addition to internal stability, this requires the (possibly heavy-tailed) process noise to have only a finite fourth moment. After quantifying cumulative uncertainties in risk functionals that account for extreme deviations, these ergodic-risk criteria are then incorporated into policy optimization, thereby extending the standard average optimal synthesis to a risk-sensitive framework. Finally, by establishing the strong duality of the constrained policy optimization, we propose a primal-dual algorithm that optimizes average performance while ensuring that certain risks associated with these ergodic-risk criteria are constrained. Our risk-sensitive framework offers a theoretically guaranteed policy iteration for the long-term risk-sensitive control of processes involving heavy-tailed noise, which is shown to be effective through several simulations.
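To make the setting concrete, the following is a minimal simulation sketch, not the paper's algorithm or its exact definition of the ergodic-risk criterion. It assumes a hypothetical two-state linear system, a hand-picked stabilizing gain K, quadratic stage costs, and Student-t process noise with 5 degrees of freedom (heavy-tailed, but with a finite fourth moment). It estimates the long-run average cost and the cumulative cost deviation scaled by 1/sqrt(T), which is the kind of quantity a central-limit-type result describes. All matrices, names, and parameters are illustrative assumptions.

```python
import numpy as np

def simulate_costs(A, B, K, Q, R, T, n_runs, df=5.0, seed=0):
    """Return an (n_runs, T) array of stage costs under the policy u_t = -K x_t."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    A_cl = A - B @ K            # closed-loop dynamics matrix
    Q_cl = Q + K.T @ R @ K      # x'Qx + u'Ru expressed as a quadratic in x
    # Internal stability: the closed-loop spectral radius must be below 1.
    assert np.max(np.abs(np.linalg.eigvals(A_cl))) < 1.0
    x = np.zeros((n_runs, n))   # every run starts at the origin
    costs = np.empty((n_runs, T))
    for t in range(T):
        costs[:, t] = np.einsum("ri,ij,rj->r", x, Q_cl, x)
        # Student-t noise with df > 4 has a finite fourth moment (assumed choice).
        w = rng.standard_t(df, size=(n_runs, n))
        x = x @ A_cl.T + w
    return costs

def scaled_deviation(costs):
    """Pooled average cost and per-run cumulative deviation scaled by 1/sqrt(T)."""
    avg = costs.mean()
    T = costs.shape[1]
    return avg, (costs - avg).sum(axis=1) / np.sqrt(T)

# Hypothetical example system (not taken from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[1.0, 1.5]])
Q = np.eye(2)
R = np.array([[0.1]])

costs = simulate_costs(A, B, K, Q, R, T=2000, n_runs=200)
avg, dev = scaled_deviation(costs)
print("estimated average cost:", avg)
print("empirical 95% quantile of scaled deviation:", np.quantile(dev, 0.95))
```

The empirical quantile of the scaled deviation is only a crude stand-in for a risk measure on the limiting cumulative uncertainty. A constrained policy optimization in the spirit of the abstract would bound such a quantity while minimizing the average cost.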