Learning-Enhanced Observer (LEO)
- Learning-Enhanced Observer (LEO) is a data-driven augmentation that integrates learning mechanisms with classical state estimators to refine model parameters and compensate for unknown nonlinearities.
- It employs tailored loss functions and gradient-based optimization to tune surrogate models and observer gains, ensuring convergence and stability for LTI and nonlinear systems.
- Empirical results demonstrate over 15% reduction in normalized estimation errors, validated through Monte Carlo simulations, with stability maintained via robust pole placement.
A Learning-Enhanced Observer (LEO) is a data-driven augmentation to classical observer design for dynamical systems, typically formulated to address parametric uncertainties or structural mismatches in system models. LEO frameworks blend classical state-estimation schemes—such as Luenberger or Kazantzis-Kravaris-Luenberger (KKL) observers—with learning-based mechanisms that empirically refine model parameters or compensate for unknown nonlinearities, thereby yielding statistically and practically significant improvements in estimation accuracy and robustness across both linear time-invariant (LTI) and nonlinear systems (Shu, 20 Nov 2025, Miao et al., 2022, Chakrabarty et al., 2020).
1. Classical Observer Foundations and Model Parameterization
At its core, LEO operates on discrete-time LTI models

$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k,$$

or their continuous-time counterparts

$$\dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t),$$

with state $x \in \mathbb{R}^n$, input $u \in \mathbb{R}^m$, output $y \in \mathbb{R}^p$, and nominal, but possibly uncertain, system matrices $(A, B, C)$.

Traditional Luenberger observers take the form

$$\hat{x}_{k+1} = A \hat{x}_k + B u_k + L\,(y_k - C \hat{x}_k),$$

where $L$ is the observer gain ensuring $A - LC$ is Schur.

LEO generalizes this by promoting $(\hat{A}, \hat{B}, \hat{C})$ (and often the initial observer state $\hat{x}_0$) to trainable variables:

$$\hat{x}_{k+1} = \hat{A}\,\hat{x}_k + \hat{B}\,u_k + L\,(y_k - \hat{C}\,\hat{x}_k).$$

This surrogate, parameterized model is iteratively fitted to observed trajectories, thereby tailoring the observer to the true system even in the presence of moderate model uncertainty (Shu, 20 Nov 2025).
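To ground the notation, here is a minimal NumPy sketch of the surrogate observer rollout; the system matrices, gain, and shapes are illustrative assumptions (chosen so that $A - LC$ is Schur), not values from the cited papers.

```python
import numpy as np

# Illustrative 2-state system; matrices and gain are assumptions, not from the papers.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5], [0.3]])  # chosen so that A - L @ C has spectral radius < 1 (Schur)

def observer_rollout(A_hat, B_hat, C_hat, x0_hat, L, u, y):
    """Surrogate observer: x_{k+1} = A_hat x_k + B_hat u_k + L (y_k - C_hat x_k)."""
    x_hat, traj = x0_hat, []
    for k in range(len(u)):
        x_hat = A_hat @ x_hat + B_hat @ u[k] + L @ (y[k] - C_hat @ x_hat)
        traj.append(x_hat)
    return np.stack(traj)

# Usage: with a perfect surrogate (A, B, C), the estimate converges to the true state.
T, x, ys = 200, np.zeros(2), []
u = np.random.randn(T, 1)
for k in range(T):
    ys.append(C @ x)
    x = A @ x + B @ u[k]
x_hat_traj = observer_rollout(A, B, C, np.ones(2), L, u, np.array(ys))
```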
2. Learning Objectives, Loss Functions, and Robustness Mechanisms
LEO frameworks define a learning objective that blends state/output discrepancy with model regularization. For LTI systems, the steady-state output discrepancy loss takes the form

$$\mathcal{L}(\hat{A}, \hat{B}, \hat{C}, \hat{x}_0) = \frac{1}{|\mathcal{W}|} \sum_{k \in \mathcal{W}} \frac{\|y_k - \hat{C}\hat{x}_k\|^2}{\|y_k\|^2 + \epsilon} + \lambda_A \|\hat{A} - A\|_F^2 + \lambda_B \|\hat{B} - B\|_F^2 + \lambda_C \|\hat{C} - C\|_F^2,$$

where $\mathcal{W}$ is a steady-state window, the regularization terms penalize large deviations from the nominal model, and the per-sample normalization ensures scale invariance (Shu, 20 Nov 2025).
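A minimal PyTorch sketch of this loss, assuming a single shared regularization weight `lam` and a steady-state window consisting of the final `window` samples; both choices, and the tensor shapes, are illustrative.

```python
import torch

def leo_loss(params, nominal, L, u, y, window=50, lam=1e-2, eps=1e-8):
    """Normalized steady-state output discrepancy plus nominal-model regularization."""
    A_hat, B_hat, C_hat, x0_hat = params   # trainable tensors (requires_grad=True)
    A, B, C = nominal                      # fixed nominal-model tensors
    x_hat, errs = x0_hat, []
    for k in range(y.shape[0]):
        y_hat = C_hat @ x_hat
        errs.append((y[k] - y_hat).square().sum() / (y[k].square().sum() + eps))
        x_hat = A_hat @ x_hat + B_hat @ u[k] + L @ (y[k] - y_hat)
    data_term = torch.stack(errs[-window:]).mean()   # steady-state window only
    reg = lam * sum((p - q).square().sum()
                    for p, q in zip((A_hat, B_hat, C_hat), (A, B, C)))
    return data_term + reg
```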
For nonlinear systems, especially with neural ODE-based LEOs, the loss generalizes to an integrated mean squared error with gain regularization:

$$\mathcal{L}(\theta) = \frac{1}{T}\int_0^T \|\hat{x}_\theta(t) - x(t)\|^2 \, dt + \mu \|\theta\|^2,$$

where $\theta$ collects observer parameters such as the eigenvalues in a KKL observer (Miao et al., 2022).
Noise augmentation and regularization during training are used to mitigate overfitting and manage the intrinsic convergence speed-robustness trade-off. Excessive gain magnitudes or aggressive pole placement can accelerate convergence but amplify steady-state error under disturbance; this trade-off is quantitatively characterized and actively tuned during LEO training (Miao et al., 2022).
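A hedged sketch of this nonlinear objective with noise augmentation folded in; the `observer` callable (e.g., a neural-ODE KKL rollout), the uniform time grid, and the noise level are illustrative assumptions.

```python
import torch

def kkl_loss(observer, theta, t_grid, x_true, y_meas, mu=1e-3, noise_std=0.01):
    """Integrated state-estimation MSE plus gain regularization; measurement noise
    is injected during training as a robustness mechanism."""
    y_noisy = y_meas + noise_std * torch.randn_like(y_meas)   # noise augmentation
    x_hat = observer(t_grid, y_noisy, theta)                  # hypothetical observer rollout
    dt = t_grid[1] - t_grid[0]                                # uniform grid assumed
    mse = ((x_hat - x_true).square().sum(dim=-1) * dt).sum()  # Riemann-sum integral
    return mse / (t_grid[-1] - t_grid[0]) + mu * theta.square().sum()
```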
3. Optimization, Constraints, and Final Observer Construction
Parameters are updated via generic gradient-based optimizers (e.g., Adam or SGD). The learning-phase loop typically comprises:
- Forward simulation of observer dynamics.
- Computation of the loss $\mathcal{L}(\hat{A}, \hat{B}, \hat{C}, \hat{x}_0)$.
- Backpropagation to compute gradients.
- Parameter update: $\theta \leftarrow \theta - \eta \nabla_\theta \mathcal{L}$ for parameters $\theta = (\hat{A}, \hat{B}, \hat{C}, \hat{x}_0)$ and step size $\eta$ (or the corresponding Adam step).
- Observer gain is recomputed via pole placement to ensure desired closed-loop stability. Observability and conditioning are monitored, with fallback to previous stable gains if observability is lost (Shu, 20 Nov 2025).
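The gain-recomputation step can be sketched with SciPy's `place_poles` applied to the dual pair $(\hat{A}^\top, \hat{C}^\top)$, together with an observability rank test and the fallback described above; the desired pole locations are left to the caller, and the helper name is illustrative.

```python
import numpy as np
from scipy.signal import place_poles

def update_gain(A_hat, C_hat, poles, L_prev):
    """Recompute L by pole placement on (A_hat, C_hat); fall back to L_prev if unobservable."""
    n = A_hat.shape[0]
    # Observability-matrix rank check
    obs = np.vstack([C_hat @ np.linalg.matrix_power(A_hat, k) for k in range(n)])
    if np.linalg.matrix_rank(obs) < n:
        return L_prev  # retain last stable gain (rare fallback)
    # Observer poles via the dual problem: eig(A - L C) = eig(A^T - C^T L^T)
    res = place_poles(A_hat.T, C_hat.T, poles)
    return res.gain_matrix.T
```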
Upon convergence, the final “learning-enhanced observer” is reconstructed using the optimized surrogate model:

$$\hat{x}_{k+1} = \hat{A}^* \hat{x}_k + \hat{B}^* u_k + L^* (y_k - \hat{C}^* \hat{x}_k),$$

where $L^*$ ensures stability ($\hat{A}^* - L^* \hat{C}^*$ Schur) for the learned parameters.
For fully unknown or nonlinear systems, extended LEOs may employ Bayesian optimization to update the coefficients $\theta$ of a function basis expansion $\hat{f}(x) = \sum_i \theta_i \phi_i(x)$ approximating the unknown dynamics, and to redesign observer gains using information from a Gaussian process surrogate of the cost surface, with all learning steps provably maintaining local input-to-state stability (L-ISS) (Chakrabarty et al., 2020).
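A compact sketch of one such Bayesian-optimization step, using a scikit-learn Gaussian process surrogate and an expected-improvement acquisition over a candidate set of coefficient vectors; the kernel choice and candidate-set mechanism are illustrative, not the exact routines of (Chakrabarty et al., 2020).

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expected_improvement(gp, X_cand, y_best):
    """EI acquisition over candidate coefficient vectors (cost minimization)."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bo_step(coeff_history, cost_history, X_cand):
    """Fit a GP surrogate to observed costs, then pick the next coefficient vector."""
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    gp.fit(np.asarray(coeff_history), np.asarray(cost_history))
    ei = expected_improvement(gp, X_cand, np.min(cost_history))
    return X_cand[np.argmax(ei)]
```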
4. Theoretical Guarantees and Statistical Validation
Theoretical results underlying LEO constructions include:
- Finite-window local matching: An LTV (noisy) system can be locally matched by an LTI surrogate if certain rank conditions are satisfied (Shu, 20 Nov 2025).
- Regularization justification: Bounds on the mismatch in initial states between two LTI systems with close system matrices further motivate regularization (Shu, 20 Nov 2025).
- Lyapunov-based L-ISS: Convex LMI-based conditions guarantee boundedness of estimation errors even during online parameter learning (Chakrabarty et al., 2020); a simplified sketch follows this list.
- Trade-off analysis: For KKL-type LEOs, estimation error bounds make explicit the dependence on gain magnitude, convergence rate, and noise amplification (Miao et al., 2022).
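As a flavor of such LMI-based designs, the following cvxpy sketch computes a gain $L$ certified by the standard discrete-time Lyapunov LMI for $\hat{A} - L\hat{C}$ Schur; this is a simplified stand-in, not the L-ISS conditions of (Chakrabarty et al., 2020), whose LMIs additionally bound disturbance gains.

```python
import cvxpy as cp
import numpy as np

def lmi_observer_gain(A, C, margin=1e-6):
    """Find L such that A - L C is Schur, via the LMI
    [[P, M^T], [M, P]] >> 0 with M = P A - Y C, P >> 0, and L = P^{-1} Y."""
    n, p = A.shape[0], C.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    Y = cp.Variable((n, p))
    M = P @ A - Y @ C
    block = cp.bmat([[P, M.T], [M, P]])
    constraints = [P >> margin * np.eye(n), block >> margin * np.eye(2 * n)]
    cp.Problem(cp.Minimize(0), constraints).solve()
    return np.linalg.solve(P.value, Y.value)  # recover L = P^{-1} Y
```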
Empirical validation is performed via Monte Carlo simulations across a diversity of systems:
- For LTI systems, a range of state dimensions and input-output channel configurations is used, with 100 random draws per configuration. LEO achieves a mean reduction in normalized steady-state estimation error exceeding 15% in open-loop and closed-loop testing, with a median success rate above 70% and statistically significant Wilcoxon signed-rank $p$-values (see the sketch after this list) (Shu, 20 Nov 2025).
- In nonlinear settings, benchmarks include Van der Pol and Duffing oscillators. Comparisons against recurrent neural network, high-gain, and robust adaptive observers consistently show that the LEO approach yields lower root-mean-square estimation errors and maintains safety during learning (Miao et al., 2022, Chakrabarty et al., 2020).
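A minimal sketch of the paired statistical protocol, using SciPy's Wilcoxon signed-rank test; the error arrays here are synthetic placeholders standing in for per-draw simulation results.

```python
import numpy as np
from scipy.stats import wilcoxon

# Paired normalized steady-state errors per Monte Carlo draw (synthetic placeholders;
# in practice these come from simulating baseline and LEO observers on each draw)
rng = np.random.default_rng(0)
baseline_err = rng.uniform(0.5, 1.0, size=100)
leo_err = baseline_err * rng.uniform(0.6, 0.95, size=100)  # hypothetical improvement

reduction = (baseline_err - leo_err) / baseline_err
print(f"mean reduction: {reduction.mean():.1%}, success rate: {(reduction > 0).mean():.1%}")
stat, p = wilcoxon(baseline_err, leo_err)  # paired signed-rank test
print(f"Wilcoxon p-value: {p:.2e}")
```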
5. Algorithms, Implementation, and Software
Platform-independent pseudocode summarizes the typical LEO training loop:
```
initialize nominal A, B, C; set initial x0
initialize surrogate parameters hat_A, hat_B, hat_C, hat_x0
design initial observer gain L (stability constraint)
for epoch in range(max_epochs):
    simulate observer state trajectory under current parameters
    compute loss L(hat_A, hat_B, hat_C, hat_x0) over steady-state window
    backpropagate gradients via autodiff
    update parameters via Adam/SGD
    if (hat_A, hat_C) observable:
        update L via pole placement
    else:
        retain previous L  # rare fallback
return optimized hat_A*, hat_B*, hat_C*, hat_x0*, L*
deploy final observer with these parameters
```
A reference implementation is available at https://github.com/Hao-B-Shu/LTI_LEO, featuring scripts for synthetic data generation, training with PyTorch/Adam, discrete-time pole placement, observability checks, and empirical evaluation of error-reduction metrics. For Bayesian-optimization-based LEOs, standard Gaussian process and acquisition-function (expected improvement) routines are applied (Chakrabarty et al., 2020).
6. Extensions to Nonlinear Systems and Related Observer Methodologies
LEO constructions have been extended to nonlinear systems by:
- Learning residual dynamics as neural network modules (e.g., in observer ODEs), thus generalizing Luenberger and KKL observers to arbitrary smooth dynamics while maintaining observability and convergence through learnable or fixed stabilizing gains (Miao et al., 2022); a sketch follows this list.
- Employing basis-expansion (e.g., Legendre polynomials or neural networks) with coefficients learned online using Bayesian optimization, while robust observer gains are designed using convex LMI feasibility programs (Chakrabarty et al., 2020).
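A minimal torch sketch of the first extension, assuming a small MLP residual and one explicit-Euler step of the observer ODE; the class name, dimensions, and integration scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualObserver(nn.Module):
    """Observer ODE: dx_hat/dt = f_nom(x_hat, u) + NN(x_hat, u) + L (y - C x_hat)."""
    def __init__(self, n, m, f_nom, C, L, hidden=32):
        super().__init__()
        self.f_nom, self.C, self.L = f_nom, C, L   # nominal dynamics, output map, gain
        self.residual = nn.Sequential(              # learned residual dynamics module
            nn.Linear(n + m, hidden), nn.Tanh(), nn.Linear(hidden, n))

    def step(self, x_hat, u, y, dt):
        """One explicit-Euler step of the observer ODE."""
        innovation = y - self.C @ x_hat
        dx = (self.f_nom(x_hat, u)
              + self.residual(torch.cat([x_hat, u]))
              + self.L @ innovation)
        return x_hat + dt * dx
```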
Correspondence with observer-theoretic concepts:
- LEO retains classic structural guarantees (convergence, stability) whenever the optimized gains satisfy standard spectral constraints (Schur/Hurwitz).
- The learning phase typically does not sacrifice safety (bounded estimation error) even under full model uncertainty, in contrast to purely data-driven (e.g., RNN-based) observers.
- Like other machine learning approaches, LEO exhibits generalization limitations when training domains are not sufficiently representative (Miao et al., 2022).
7. Benchmarks, Practical Impact, and Outlook
LEO frameworks deliver statistically significant improvements (average error reduction exceeding 15%) without major alterations to classical observer structures, requiring only mild assumptions on observability and system parameter initialization. Safety and performance certificates are available for both linear and nonlinear cases, and LEO outperforms a range of conventional and machine-learning baselines on standard benchmarks, even when those baselines are provided with greater a priori knowledge of the plant model (Shu, 20 Nov 2025, Miao et al., 2022, Chakrabarty et al., 2020).
The modularity of LEO—encompassing observer-theoretic stability, empirical model refinement, and robust learning—positions it as a tractable and principled tool for state estimation in the presence of realistic uncertainties and structural mismatches, with both theoretical and implementation assets available to practitioners and researchers.