Canonical Bayesian Linear System Identification (2507.11535v1)

Published 15 Jul 2025 in stat.ML, cs.LG, cs.SY, eess.SY, and stat.CO

Abstract: Standard Bayesian approaches for linear time-invariant (LTI) system identification are hindered by parameter non-identifiability; the resulting complex, multi-modal posteriors make inference inefficient and impractical. We solve this problem by embedding canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics (e.g., transfer functions, eigenvalues, predictive distributions of system outputs) while resolving identifiability. This approach unlocks the use of meaningful, structure-aware priors (e.g., enforcing stability via eigenvalues) and ensures conditions for a Bernstein--von Mises theorem -- a link between Bayesian and frequentist large-sample asymptotics that is broken in standard forms. Extensive simulations with modern MCMC methods highlight advantages over standard parameterizations: canonical forms achieve higher computational efficiency, generate interpretable and well-behaved posteriors, and provide robust uncertainty estimates, particularly from limited data.

Summary

  • The paper proposes a canonical Bayesian framework that overcomes non-identifiability by reformulating LTI systems into unique, minimal representations.
  • It leverages structure-aware priors on eigenvalues to enforce stability and achieve interpretable uncertainty quantification across system parameters.
  • Empirical evaluations show that the canonical approach improves sampling efficiency and estimation accuracy, outperforming classical methods in low-data regimes.

Canonical Bayesian Linear System Identification: An Expert Overview

This paper addresses a fundamental challenge in Bayesian identification of linear time-invariant (LTI) systems: the non-identifiability of standard state-space parameterizations. The authors propose a rigorous Bayesian framework that leverages canonical forms to resolve this issue, enabling efficient inference, interpretable priors, and robust uncertainty quantification. The work is notable for its theoretical contributions—establishing identifiability, enabling structure-aware priors, and proving a Bernstein–von Mises (BvM) theorem—as well as for its comprehensive empirical validation.

Problem Setting and Motivation

LTI systems are ubiquitous in control, signal processing, and scientific modeling. The standard state-space model,

$$x_{t+1} = A x_t + B u_t + w_t, \qquad y_t = C x_t + D u_t + z_t,$$

is parameterized by the matrices $(A, B, C, D)$, with process noise $w_t$ and measurement noise $z_t$. The system identification problem is to infer these matrices from observed input-output data.
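
To make the setup concrete, the following sketch simulates this state-space model with Gaussian noise; the system matrices and noise variances are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state SISO system (illustrative values only).
A = np.array([[0.9, 0.2],
              [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
q, r = 0.01, 0.05                       # process / measurement noise variances

T_steps = 200
u = rng.normal(size=(T_steps, 1))       # exciting input sequence
x = np.zeros(2)
y = np.zeros((T_steps, 1))
for t in range(T_steps):
    y[t] = C @ x + D @ u[t] + rng.normal(scale=np.sqrt(r), size=1)
    x = A @ x + B @ u[t] + rng.normal(scale=np.sqrt(q), size=2)
```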

Classical identification methods (e.g., prediction error, subspace, frequency-domain) provide point estimates but lack uncertainty quantification and are agnostic to prior knowledge. Bayesian approaches, in principle, address these limitations by treating parameters as random variables and updating beliefs via the posterior. However, the standard parameterization is non-identifiable: many $(A, B, C, D)$ tuples yield the same input-output behavior due to similarity transformations. This leads to highly multimodal, non-Gaussian posteriors that are difficult to sample and interpret.
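
A small numerical check makes this non-identifiability tangible: any invertible similarity transformation $T$ produces a different $(A, B, C, D)$ with exactly the same input-output map, verified here through the Markov parameters $C A^k B$ (the matrices are illustrative, not from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

T = rng.normal(size=(2, 2))                  # any invertible matrix works
Ti = np.linalg.inv(T)
A2, B2, C2 = T @ A @ Ti, T @ B, C @ Ti       # similarity-transformed realization

markov  = [C  @ np.linalg.matrix_power(A,  k) @ B  for k in range(5)]
markov2 = [C2 @ np.linalg.matrix_power(A2, k) @ B2 for k in range(5)]
print(np.allclose(markov, markov2))          # True: identical input-output behavior
```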

Canonical Forms and Identifiability

The core innovation is to perform Bayesian inference in a canonical parameterization of the LTI system. Canonical forms (e.g., controller or observer canonical form for SISO systems) provide a unique, minimal representation for each equivalence class of input-output behavior. The authors rigorously prove that:

  • Equivalence classes: All minimal LTI systems related by similarity transformations are statistically isomorphic; they induce identical likelihoods for any input-output data.
  • Canonical sufficiency: Inference on the canonical parameters $\Theta_c$ is sufficient to recover the posterior over all system properties invariant under similarity (e.g., transfer function, eigenvalues, Hankel matrix).
  • Prior consistency: Any prior on the standard parameter space induces a well-defined prior on the canonical space, and structure-aware priors (e.g., on eigenvalues) can be specified directly and coherently.

This resolves the non-identifiability problem and reduces the parameter space from $O(d_x^2)$ to $O(d_x)$ for SISO systems, with analogous (though more complex) reductions for MIMO systems.
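
For a SISO system, one standard choice is the controller canonical form, in which the $d_x$ characteristic-polynomial coefficients, the $d_x$ output coefficients, and the scalar feedthrough determine a unique realization. The sketch below (a standard construction written independently, not code from the paper) builds $(A, B, C, D)$ from such a minimal parameter vector.

```python
import numpy as np

def controller_canonical(a, c, d=0.0):
    """a: coefficients a_1..a_dx of z^dx + a_1 z^(dx-1) + ... + a_dx;
    c: output coefficients; d: scalar feedthrough."""
    dx = len(a)
    A = np.zeros((dx, dx))
    A[0, :] = -np.asarray(a)          # top row carries the characteristic polynomial
    A[1:, :-1] = np.eye(dx - 1)       # shift structure below the top row
    B = np.zeros((dx, 1)); B[0, 0] = 1.0
    C = np.asarray(c, dtype=float).reshape(1, dx)
    D = np.array([[float(d)]])
    return A, B, C, D

# 2*dx + 1 = 5 free parameters for dx = 2, versus dx^2 + 2*dx + 1 = 9 in the
# unconstrained (A, B, C, D) parameterization.
A, B, C, D = controller_canonical(a=[-1.5, 0.56], c=[1.0, 0.3])
print(np.linalg.eigvals(A))           # roots of z^2 - 1.5 z + 0.56: 0.8 and 0.7
```

The count of $2 d_x + 1$ free parameters matches the $O(d_x)$ scaling noted above.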

Structure-Aware Priors

A major practical advantage of canonical forms is the ability to specify informative, interpretable priors. The authors detail how to:

  • Place priors directly on the eigenvalues of $A$, enforcing stability and other dynamical constraints.
  • Use Vieta’s formulas to map eigenvalue priors to priors on canonical coefficients, with explicit Jacobian corrections.
  • Handle mixtures of real and complex eigenvalues, and analyze the induced distributions for low-dimensional cases.

This enables practitioners to encode domain knowledge (e.g., stability margins, oscillatory behavior) in a principled way, which is infeasible in the standard parameterization.
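
As one concrete illustration of the eigenvalue-based construction (the specific prior below is a hypothetical choice, not the paper's), stable eigenvalues can be sampled inside the unit disc and pushed forward to canonical coefficients via Vieta's formulas; computing the induced density on the coefficients would additionally require the Jacobian correction mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_stable_eigs(dx):
    """Illustrative prior only: one complex-conjugate pair (if dx >= 2) plus
    real eigenvalues, all with modulus < 1 (discrete-time stability)."""
    eigs = []
    if dx >= 2:
        radius = rng.uniform(0.0, 1.0)          # modulus inside the unit disc
        angle = rng.uniform(0.0, np.pi)
        eigs += [radius * np.exp(1j * angle), radius * np.exp(-1j * angle)]
    eigs += list(rng.uniform(-1.0, 1.0, size=dx - len(eigs)))
    return np.array(eigs)

eigs = sample_stable_eigs(dx=3)
# Vieta's formulas: np.poly maps the roots (eigenvalues) to the coefficients
# of the monic characteristic polynomial; [1:] drops the leading 1.
coeffs = np.poly(eigs).real[1:]
print(eigs)
print(coeffs)   # a_1 .. a_dx; their density would need the Jacobian term
```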

Posterior Geometry and Asymptotics

The canonical parameterization yields posteriors that are typically unimodal and well-behaved, in contrast to the complex, multimodal posteriors in the standard form. The authors prove that:

  • The Fisher information matrix (FIM) is non-singular in the canonical form, enabling asymptotic normality.
  • A Bernstein–von Mises theorem holds: as data increases, the posterior over canonical parameters converges to a Gaussian centered at an efficient estimator, with covariance given by the inverse FIM.
  • In the standard parameterization, the FIM is singular and the BvM theorem fails; the posterior remains diffuse over the non-identifiable directions.

This justifies the use of Gaussian approximations and Laplace methods for uncertainty quantification in large-sample regimes, but only in the canonical form.
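
Concretely, the BvM statement takes the usual total-variation form (the paper's precise regularity conditions are not reproduced here): writing $\Theta_c^\star$ for the true canonical parameter and $\hat{\Theta}_c^{(n)}$ for an efficient estimator such as the MLE from $n$ observations,

$$\bigl\| \Pi(\cdot \mid y_{1:n}, u_{1:n}) - \mathcal{N}\bigl(\hat{\Theta}_c^{(n)},\, \tfrac{1}{n} I(\Theta_c^\star)^{-1}\bigr) \bigr\|_{\mathrm{TV}} \longrightarrow 0 \quad \text{in probability},$$

where $I(\Theta_c^\star)$ is the per-observation Fisher information, invertible precisely because the canonical parameterization is identifiable.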

Empirical Evaluation

The paper presents extensive numerical experiments, demonstrating:

  • Posterior geometry: Canonical posteriors are unimodal and interpretable; standard posteriors are highly multimodal and correlated.
  • Sampling efficiency: MCMC (NUTS) achieves much higher effective sample size per second in the canonical form, with better mixing and convergence.
  • Estimation accuracy: Bayesian inference in canonical form outperforms classical methods (e.g., Ho–Kalman) in low-data and noisy regimes, especially when informative priors are used.
  • Prior impact: Informative, stability-enforcing priors yield substantial gains in small-sample settings.
  • Asymptotic behavior: Empirical posteriors converge to the Gaussian predicted by the BvM theorem as data increases.
  • Scalability: The canonical approach scales better with system dimension, both in computation and sampling.

Practical Implications

For practitioners, the canonical Bayesian approach enables:

  • Reliable, interpretable uncertainty quantification for LTI system identification.
  • Incorporation of domain knowledge via structure-aware priors.
  • Efficient inference, even in high-dimensional or data-limited settings.
  • Robustness to ill-conditioning and non-identifiability.

The framework is directly applicable to SISO systems and, with additional care, to MIMO systems. The authors provide implementation details, including efficient likelihood and gradient computation via Kalman filtering and automatic differentiation.
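
A minimal version of that likelihood computation, written here as an independent sketch rather than the paper's implementation, runs a Kalman filter and accumulates the Gaussian log-density of each one-step-ahead innovation; replacing NumPy with an autodiff array library such as JAX would supply the gradients used by samplers like NUTS.

```python
import numpy as np

def kalman_loglik(A, B, C, D, Q, R, u, y, x0=None, P0=None):
    """Exact log-likelihood of (u, y) under the LTI model with process
    covariance Q and measurement covariance R."""
    dx = A.shape[0]
    x = np.zeros(dx) if x0 is None else x0          # prior state mean
    P = np.eye(dx) if P0 is None else P0            # prior state covariance
    ll = 0.0
    for t in range(len(y)):
        # Predicted output, innovation, and innovation covariance.
        e = y[t] - (C @ x + D @ u[t])
        S = C @ P @ C.T + R
        ll += -0.5 * (e @ np.linalg.solve(S, e)
                      + np.linalg.slogdet(2 * np.pi * S)[1])
        # Measurement update.
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ e
        P = P - K @ C @ P
        # Time update.
        x = A @ x + B @ u[t]
        P = A @ P @ A.T + Q
    return ll
```

On the simulated data from the first sketch, `kalman_loglik(A, B, C, D, q * np.eye(2), r * np.eye(1), u, y)` evaluates the exact log-likelihood of a candidate model.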

Limitations and Future Directions

  • Model order selection: The framework assumes known state dimension; Bayesian model selection (e.g., via marginal likelihood) is computationally intensive and remains an open challenge.
  • MIMO systems: Canonical forms are more complex and less unique; hybrid or adaptive schemes may be needed for efficient inference.
  • Scalability: While improved over standard forms, full Bayesian inference remains more expensive than point estimation; variational or stochastic gradient methods may be needed for very large systems.
  • Nonlinear systems: Extending the approach to nonlinear state-space models, with analogous decomposition into identifiable and non-identifiable components, is a promising but challenging direction.

Conclusion

This work provides a rigorous and practical solution to the longstanding problem of non-identifiability in Bayesian LTI system identification. By embedding canonical forms within the Bayesian framework, the authors enable efficient, interpretable, and robust inference, with strong theoretical guarantees and demonstrated empirical benefits. The approach is broadly applicable and sets a new standard for uncertainty quantification in system identification. Future work on model order selection, scalable inference, and extensions to nonlinear systems will further enhance its impact.
