Global convergence of policy gradient methods for LQG with noise via IOH parameterization
Establish global convergence guarantees for policy gradient methods applied to the linear quadratic Gaussian (LQG) dynamic output-feedback control problem under the input–output-history (IOH) parameterization, in the presence of Gaussian process and measurement noise. Such a result would extend the existing noise-free convergence results for IOH-based policy gradient methods to the noisy LQG setting.
References
However, proving the global convergence of PGMs to LQG control problems with noise inputs by extending the result in is not straightforward (as discussed in Section~\ref{Sec3C}) and remains an open challenge.
— Policy Gradient Method for LQG Control via Input-Output-History Representation: Convergence to $O(ε)$-Stationary Points
(2510.19141 - Sadamoto et al., 22 Oct 2025) in Related Works (Introduction)