Recursive formulation of the direct data-based LQR parameterization
Determine a recursive formulation for the direct data-driven LQR design that parameterizes the controller via a data-based matrix $G$ with constraints $X_0 G = I_n$ and $\Sigma = I_n + X_1 G \Sigma G^{\top} X_1^{\top}$ (with $K = U_0 G$), so that the resulting update is suited for online closed-loop adaptation and does not scale with the data length $t$.
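To make the scaling issue concrete, the following is a minimal NumPy sketch of the parameterization, not the cited paper's implementation. It assumes noise-free data from a hypothetical double-integrator system (A, B), a hypothetical stabilizing gain K, and picks the minimum-norm G through a pseudoinverse, which is only one of many G satisfying the constraints. The helper names collect_data, parameterize and solve_sigma are illustrative.

```python
import numpy as np


def collect_data(A, B, t, rng):
    """Simulate x_{k+1} = A x_k + B u_k with random inputs; return X0, U0, X1."""
    n, m = B.shape
    X = np.zeros((n, t + 1))
    U = rng.standard_normal((m, t))
    X[:, 0] = rng.standard_normal(n)
    for k in range(t):
        X[:, k + 1] = A @ X[:, k] + B @ U[:, k]
    return X[:, :t], U, X[:, 1:]


def parameterize(K, X0, U0):
    """Minimum-norm G (t x n) with U0 G = K and X0 G = I_n (assumed choice)."""
    n = X0.shape[0]
    D0 = np.vstack([U0, X0])  # (m + n) x t, assumed to have full row rank
    return np.linalg.pinv(D0) @ np.vstack([K, np.eye(n)])


def solve_sigma(Acl):
    """Solve Sigma = I_n + Acl Sigma Acl^T by vectorizing the equation."""
    n = Acl.shape[0]
    lhs = np.eye(n * n) - np.kron(Acl, Acl)
    return np.linalg.solve(lhs, np.eye(n).reshape(-1)).reshape(n, n)


# Hypothetical system and gain, chosen only for illustration.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[-10.0, -5.0]])  # A + B K has spectral radius below 1

rng = np.random.default_rng(0)
results = {}
for t in (20, 40):
    X0, U0, X1 = collect_data(A, B, t, rng)
    G = parameterize(K, X0, U0)
    Sigma = solve_sigma(X1 @ G)  # X1 G equals A + B K for noise-free data
    results[t] = (X0, U0, X1, G, Sigma)
    print(t, G.shape)  # the number of rows of G equals the data length t
```

In this sketch the decision variable G has t rows, so its size doubles when the data length goes from 20 to 40, while K and Sigma keep their fixed dimensions. That growth is the obstacle to using the parameterization directly in an online update.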
References
However, the dimension of the LQR parameterization (\ref{prob:equi}) scales linearly with $t$, and it is unclear how to turn (\ref{prob:equi}) into a recursive formulation suited for online closed-loop adaptation.
— Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR
(2401.14871 - Zhao et al., 26 Jan 2024) in Section 2, subsection "Direct LQR design with data-based policy parameterization"