
Canonical Bayesian Linear System Identification

Updated 21 July 2025
  • Canonical Bayesian linear system identification is a framework that uses minimal, unique parameterizations to resolve the redundancy in standard LTI system models.
  • It employs a fully Bayesian approach with structure-aware priors to produce unimodal, interpretable posteriors that enhance computational efficiency and inference accuracy.
  • This method offers practical benefits, including improved MCMC performance, reliable uncertainty quantification, and robust applicability in control, robotics, and econometrics.

Canonical Bayesian Linear System Identification addresses the problem of inferring the parameters or structure of linear time-invariant (LTI) systems from observed input–output data within a fully Bayesian framework, with an emphasis on resolving identifiability, incorporating structure-aware priors, and enabling robust uncertainty quantification. Classical Bayesian identification methods suffer from parameter non-identifiability due to state-space symmetry: many redundant parameterizations yield identical system behavior, resulting in highly multimodal posteriors and hindering both inference and interpretation. By embedding canonical (i.e., minimal, unique) parameterizations into the Bayesian inference process, these issues are resolved, leading to computationally efficient, interpretable, and theoretically grounded identification strategies that connect Bayesian and frequentist analyses via the Bernstein–von Mises theorem (Bryutkin et al., 15 Jul 2025).

1. Canonical Form Parameterizations and Non-Identifiability

Non-identifiability arises in the Bayesian identification of LTI systems because different sets of state-space matrices $(A, B, C, D)$, related via invertible similarity transformations, represent the same input–output dynamics:

$$A' = T^{-1} A T, \quad B' = T^{-1} B, \quad C' = C T, \quad D' = D$$

for any invertible matrix $T$. This redundancy induces a highly symmetric, complicated, and multimodal posterior when performing inference directly in the space of standard state-space parameters.
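This symmetry is easy to verify numerically. The sketch below (with illustrative system matrices, not taken from the paper) applies a random similarity transform and checks that the transfer function is unchanged:

```python
import numpy as np

# Hypothetical 2-state SISO system (illustrative values only).
A = np.array([[0.5, 0.2], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -1.0]])
D = np.array([[0.0]])

rng = np.random.default_rng(0)
T = rng.normal(size=(2, 2))          # a generic invertible similarity transform
Tinv = np.linalg.inv(T)

# Transformed realization: different parameters, same input-output behaviour.
A2, B2, C2, D2 = Tinv @ A @ T, Tinv @ B, C @ T, D

def transfer(A, B, C, D, z):
    """Evaluate G(z) = D + C (zI - A)^{-1} B at a complex frequency z."""
    n = A.shape[0]
    return D + C @ np.linalg.solve(z * np.eye(n) - A, B)

z = 1.3 + 0.7j
G1 = transfer(A, B, C, D, z)
G2 = transfer(A2, B2, C2, D2, z)
print(np.allclose(G1, G2))  # True: both realizations give the same G(z)
```

Any likelihood built on input–output data therefore takes identical values on the whole orbit of realizations, which is exactly what produces the multimodal posterior.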

Canonical forms—such as the controller canonical form for SISO systems, where $A_c$ has a companion-matrix structure and only the coefficients of the characteristic and numerator polynomials appear as free parameters—provide a one-to-one correspondence between system parameters and system behavior. Embedding the Bayesian prior and likelihood over the canonical parameter space $\Theta_c$ ensures that each parameter vector corresponds uniquely to a specific transfer function or system realization, removing the equivalence-class ambiguity pervasive in the standard parameterization. The result is a well-posed statistical inference problem over a reduced, interpretable parameter space (Bryutkin et al., 15 Jul 2025).
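A minimal construction of the controller canonical form is sketched below. The coefficient convention (last row of the companion matrix holding $-a_0, \ldots, -a_{n-1}$, input entering through the last state) is one common choice and is assumed for illustration; the paper's convention may differ in detail:

```python
import numpy as np

def controller_canonical(a, b, d0):
    """Controller canonical form for a SISO system of order n.

    a: characteristic-polynomial coefficients (a_0, ..., a_{n-1}), so that
       det(zI - A_c) = z^n + a_{n-1} z^{n-1} + ... + a_0.
    b: numerator coefficients (b_0, ..., b_{n-1}); d0: direct feedthrough.
    """
    n = len(a)
    Ac = np.zeros((n, n))
    Ac[:-1, 1:] = np.eye(n - 1)     # shift structure: super-diagonal of ones
    Ac[-1, :] = -np.asarray(a)      # last row holds -a_0, ..., -a_{n-1}
    Bc = np.zeros((n, 1)); Bc[-1, 0] = 1.0
    Cc = np.asarray(b).reshape(1, n)
    Dc = np.array([[d0]])
    return Ac, Bc, Cc, Dc

Ac, Bc, Cc, Dc = controller_canonical([0.06, -0.5], [1.0, 2.0], 0.0)
# The companion matrix reproduces exactly the chosen characteristic polynomial:
print(np.poly(Ac))   # [1.  -0.5  0.06], i.e. z^2 - 0.5 z + 0.06
```

Because every free entry of $(A_c, B_c, C_c, D_c)$ is pinned down by the $2n + 1$ canonical parameters, no similarity-transform freedom remains.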

2. Bayesian Inference in Canonical Coordinates and Posterior Properties

Bayesian inference over the canonical parameter space proceeds by constructing the posterior distribution

$$p(\Theta_c \mid y_{1:T}, u_{1:T}) \propto p(y_{1:T} \mid \Theta_c, u_{1:T})\, p(\Theta_c),$$

where $p(y_{1:T} \mid \Theta_c, u_{1:T})$ is the likelihood of the data under the system specified by the canonical parameters $\Theta_c$.

Key properties:

  • The likelihood is invariant to coordinate representation, so the canonicalization preserves all information (transfer function, eigenvalues, predictive distributions).
  • The posterior is well-behaved: it is unimodal and has regular, interpretable geometry, facilitating efficient sampling by modern MCMC methods such as the No-U-Turn Sampler (NUTS).
  • Effective sample size and mixing rates are improved, especially as the state dimension increases.
  • The canonical posterior provides interpretable marginal distributions on physically meaningful system quantities (e.g., poles, zeros, stability margins).
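A log-posterior over canonical parameters can be sketched as follows, assuming (for illustration only) i.i.d. Gaussian observation noise and independent Gaussian priors; the paper's noise model and prior choices may differ. The resulting log density is what a gradient-based sampler such as NUTS would target:

```python
import numpy as np

def simulate(theta, u, n=2):
    """Simulate y_{1:T} for a SISO system in controller canonical form.

    theta = (a_0, ..., a_{n-1}, b_0, ..., b_{n-1}, d_0)."""
    a, b, d = theta[:n], theta[n:2*n], theta[2*n]
    A = np.zeros((n, n)); A[:-1, 1:] = np.eye(n - 1); A[-1, :] = -a
    B = np.zeros(n); B[-1] = 1.0
    x = np.zeros(n); ys = []
    for uk in u:
        ys.append(b @ x + d * uk)
        x = A @ x + B * uk
    return np.array(ys)

def log_posterior(theta, u, y, sigma=0.1, prior_scale=1.0):
    """Gaussian likelihood + independent Gaussian priors (illustrative)."""
    resid = y - simulate(theta, u)
    log_lik = -0.5 * np.sum(resid**2) / sigma**2
    log_prior = -0.5 * np.sum(theta**2) / prior_scale**2
    return log_lik + log_prior

rng = np.random.default_rng(0)
u = rng.normal(size=200)
theta_true = np.array([0.06, -0.5, 1.0, 2.0, 0.0])
y = simulate(theta_true, u) + 0.1 * rng.normal(size=200)
# The posterior density is higher at the generating parameters than at a
# perturbed parameter vector:
print(log_posterior(theta_true, u, y) > log_posterior(theta_true + 0.3, u, y))
```

Because `theta` parameterizes the system uniquely, this density has none of the similarity-orbit symmetry that plagues the redundant state-space parameterization.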

For instance, in the SISO controller canonical form, the parameter vector $(a_0, \ldots, a_{n-1}, b_0, \ldots, b_{n-1}, d_0)$ is minimal and directly generates the system's transfer function

$$G(z) = D_c + C_c (zI - A_c)^{-1} B_c,$$

with all invariants captured precisely (Bryutkin et al., 15 Jul 2025).

3. Structure-Aware Priors and Invariant Constraints

Canonical parameterizations enable the use of structure-aware, interpretable priors:

  • Priors can be placed directly on the system’s eigenvalues—corresponding to the roots of the characteristic polynomial—allowing explicit enforcement of stability constraints (e.g., all poles within the unit disk in the discrete-time case).
  • Using Vieta’s formulas, priors over eigenvalue configurations induce priors over the canonical parameters $(a_0, \ldots, a_{n-1})$, incorporating the stability indicator function and the Vandermonde Jacobian:

$$p(a_0, \ldots, a_{n-1}) = p(\lambda_1, \ldots, \lambda_n) \left| \det \frac{\partial (\lambda_1, \ldots, \lambda_n)}{\partial (a_0, \ldots, a_{n-1})} \right|,$$

where

$$\left| \det D\Psi(\lambda_1, \ldots, \lambda_n) \right| = \prod_{i<j} |\lambda_i - \lambda_j|$$

for SISO systems (Bryutkin et al., 15 Jul 2025).
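The eigenvalue-to-coefficient map and its Vandermonde change-of-variables factor can be checked numerically. The pole configuration below is an illustrative choice (all poles inside the unit disk, i.e. a stable discrete-time system):

```python
import numpy as np
from itertools import combinations

# Hypothetical stable pole configuration (all inside the unit disk).
lam = np.array([0.2, 0.5, -0.4])

# Vieta: monic characteristic polynomial z^3 + a_2 z^2 + a_1 z + a_0.
coeffs = np.poly(lam)          # numpy ordering: [1, a_2, a_1, a_0]
a = coeffs[1:]

# Vandermonde Jacobian |det DΨ| = prod_{i<j} |λ_i - λ_j|: the factor that
# converts a density over eigenvalues into one over coefficients.
jac = np.prod([abs(li - lj) for li, lj in combinations(lam, 2)])
print(a)     # [-0.3  -0.18  0.04]
print(jac)   # 0.3 * 0.6 * 0.9 = 0.162
```

Note that the Jacobian vanishes when eigenvalues coincide, so eigenvalue-based priors implicitly down-weight coefficient vectors near repeated-pole configurations.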

This facilitates the injection of domain expertise (such as requiring stability or modeling lightly damped/resonant systems) directly and accurately into the Bayesian modeling pipeline.

4. Asymptotic Theory: The Bernstein–von Mises Theorem

A key theoretical advancement is the establishment of the Bernstein–von Mises (BvM) theorem in the canonical Bayesian setting. Under standard regularity conditions and for increasing sample size $T$, the posterior distribution over canonical parameters $\Theta_c$ converges (in total variation and in distribution) to a Gaussian centered at an efficient estimator (such as the MLE or posterior mean), with covariance given by the inverse of the Fisher information matrix:

$$\sqrt{T}\,(\Theta_c - \hat{\Theta}_c) \xrightarrow{d} \mathcal{N}\!\left(0,\; I^{-1}(\Theta_c^0)\right),$$

where $\Theta_c^0$ are the true system parameters and $I(\Theta_c^0)$ is the asymptotic Fisher information.

In standard (redundant) coordinates, the Fisher information matrix is singular due to non-identifiability—precluding the BvM result and invalidating the frequentist–Bayesian connection. Canonical forms eliminate this singularity, restoring equivalence between Bayesian credible regions and frequentist confidence sets in large-sample regimes and justifying Gaussian Laplace approximations (Bryutkin et al., 15 Jul 2025).
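The rank deficiency behind the singular Fisher information can be exhibited directly: the map from parameters to observable quantities (here, the first Markov parameters $D, CB, CAB, \ldots$, used as a stand-in for the likelihood's sufficient statistics) has a rank-deficient Jacobian in the redundant 9-parameter space but full column rank in the 5-dimensional canonical space. All numbers are illustrative:

```python
import numpy as np

def markov_params(theta, n=2, n_markov=8, canonical=False):
    """First Markov parameters [D, CB, CAB, CA^2B, ...] of a SISO system."""
    if canonical:   # theta = (a_0, a_1, b_0, b_1, d_0)
        a, b, d = theta[:n], theta[n:2*n], theta[2*n]
        A = np.zeros((n, n)); A[:-1, 1:] = np.eye(n - 1); A[-1, :] = -a
        B = np.zeros((n, 1)); B[-1, 0] = 1.0
        C = b.reshape(1, n)
    else:           # theta = all 9 entries of (A, B, C, D)
        A = theta[:n*n].reshape(n, n)
        B = theta[n*n:n*n + n].reshape(n, 1)
        C = theta[n*n + n:n*n + 2*n].reshape(1, n)
        d = theta[-1]
    out, AkB = [d], B
    for _ in range(n_markov - 1):
        out.append((C @ AkB)[0, 0])
        AkB = A @ AkB
    return np.array(out)

def num_jacobian(f, x, eps=1e-5):
    """Central-difference Jacobian of f at x."""
    cols = []
    for i in range(len(x)):
        dx = np.zeros_like(x); dx[i] = eps
        cols.append((f(x + dx) - f(x - dx)) / (2 * eps))
    return np.column_stack(cols)

# Canonical parameters and an equivalent redundant parameterization
# obtained via a similarity transform T.
theta_can = np.array([0.06, -0.5, 1.0, 2.0, 0.5])
a, b, d = theta_can[:2], theta_can[2:4], theta_can[4]
Ac = np.array([[0.0, 1.0], [-a[0], -a[1]]])
Bc = np.array([[0.0], [1.0]]); Cc = b.reshape(1, 2)
T = np.array([[1.0, 0.3], [-0.2, 1.1]]); Tinv = np.linalg.inv(T)
A2, B2, C2 = Tinv @ Ac @ T, Tinv @ Bc, Cc @ T
theta_full = np.concatenate([A2.ravel(), B2.ravel(), C2.ravel(), [d]])

Jc = num_jacobian(lambda t: markov_params(t, canonical=True), theta_can)
Jf = num_jacobian(lambda t: markov_params(t), theta_full)
rank_can = np.linalg.matrix_rank(Jc, tol=1e-6)
rank_full = np.linalg.matrix_rank(Jf, tol=1e-6)
# Canonical: rank 5 of 5 parameters (identifiable). Redundant: rank 5 of 9,
# the deficit 4 = dim GL(2) being the similarity-transform freedom.
print(rank_can, rank_full)  # 5 5
```

The four flat directions of the redundant parameterization are exactly the infinitesimal similarity transforms, and they make the Fisher information singular there; the canonical coordinates remove them.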

5. Computational Considerations and Simulation Insights

Empirical and computational outcomes arising from canonical Bayesian linear system identification include:

  • Superior mixing and effective sample sizes (ESS) in MCMC, particularly for high-dimensional systems or when observations are limited.
  • Posterior densities over canonical parameters are interpretable and directly related to system invariants, enabling robust and meaningful uncertainty quantification.
  • In data-limited settings, the method reliably provides credible estimates for dynamical invariants (e.g., poles, transfer function values) and predictive distributions for future outputs.
  • Simulations demonstrate that even with small data, posterior mean estimates (PME) in the canonical space are more interpretable and less biased than those computed from redundant parameterizations, with superior estimation accuracy and computational speed.
  • As the observation window TT increases, canonical Bayesian inference achieves the predicted Gaussian asymptotic behavior, and credible intervals become consistent with the frequentist confidence regions, in sharp contrast to the behavior of standard approaches (Bryutkin et al., 15 Jul 2025).

6. Practical Impact and Applications

The canonical Bayesian approach confers several practical advantages:

  • Enables efficient, robust, and interpretable system identification for LTI models in real-world applications—including robotics, control, econometrics, and digital twins—where well-defined uncertainty quantification and credible intervals for invariants (such as stability margins) are critical.
  • Reduces the computational burden by lowering the dimension of the sampled parameter space and eliminating inefficiencies related to multimodality and symmetry.
  • Allows the specification of informative priors, enhancing performance in low-data scenarios or for safety-critical applications where stability is non-negotiable.
  • Because the framework naturally extends to Bayesian model averaging, uncertainty in system order may also be incorporated by defining a prior over model dimension and embedding canonical forms of all possible orders within a unified inferential apparatus.

7. Broader Connections and Extensions

This methodology unifies and clarifies the theoretical landscape of Bayesian system identification. Embedding canonical forms—such as controller or observer canonical representations—resolves representational ambiguities, clarifies the relation between Bayesian and frequentist paradigms via the BvM result, and enables practitioners to utilize structure-aware priors to encode dynamical knowledge. The canonical Bayesian approach thus provides a principled and computationally efficient path for learning, uncertainty quantification, and decision-making in identification, prediction, and control of linear dynamical systems (Bryutkin et al., 15 Jul 2025).
