Bayesian Hamiltonian Learning

Updated 22 May 2026

Bayesian Hamiltonian Learning is a probabilistic framework that infers unknown Hamiltonians by treating model parameters and latent states as random variables within a hierarchical model.
It integrates physics-based constraints, structure-preserving integrators, and noise modeling to yield accurate, uncertainty-quantified predictions and guide adaptive experimental designs.
Applied to quantum, classical, and dissipative systems, BHL enables scalable reduced-order modeling that robustly preserves physical dynamics even in noisy, high-dimensional settings.

Bayesian Hamiltonian Learning (BHL) is a rigorous statistical framework for inferring unknown Hamiltonians and their associated dynamics from experimental or simulated data by treating all model parameters, latent states, noise sources, and often the structure of the Hamiltonian itself as random variables within a hierarchical probabilistic model. BHL integrates prior physical constraints, data fidelity, and noise modeling into a unified posterior that quantifies uncertainty and guides adaptive experiment design, system identification, and predictive simulation. BHL methodologies are applied to quantum, classical, and dissipative systems, and have enabled scalable learning with uncertainty quantification, robustness to structured and heavy-tailed noise, and physical structure preservation in high-dimensional settings.

1. Hamiltonian Learning Problem: Formulations and Models

The Hamiltonian learning problem involves estimating the unknown parameters or functional form of a Hamiltonian $H$ governing a dynamical system, given observed data. Systems of interest include classical mechanics, quantum dynamics (especially spin systems and superconducting qubits), nonseparable and separable Hamiltonian systems, as well as generalized dissipative and port-Hamiltonian models (Evans et al., 2019, Beckers et al., 2023, Galioto et al., 2024).

Nonseparable Hamiltonian Systems

For a system with positions $q \in \mathbb{R}^d$, momenta $p \in \mathbb{R}^d$, and state $x=(q^\top,p^\top)^\top\in\mathbb{R}^{2d}$, the general Hamiltonian model is

$\dot q = \frac{\partial H(q,p;\theta)}{\partial p}, \quad \dot p = -\frac{\partial H(q,p;\theta)}{\partial q},$

where $H$ may be parameterized as a sum of a quadratic form and a nonlinear function (neural network expansion or polynomial), with parameters $\theta$ (Galioto et al., 2024).
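As a minimal sketch of how Hamilton's equations turn a parameterized $H$ into a vector field, the snippet below uses a hypothetical nonseparable Hamiltonian (a quadratic part plus a $q$-$p$ coupling term). The function names and the three-parameter form of `theta` are illustrative assumptions, not the parameterization used in the cited work.

```python
import numpy as np

def hamiltonian(q, p, theta):
    """Hypothetical nonseparable H: quadratic part plus a q-p coupling term."""
    a, b, c = theta
    return 0.5 * a * np.dot(p, p) + 0.5 * b * np.dot(q, q) + c * np.dot(q, q) * np.dot(p, p)

def grad_hamiltonian(q, p, theta):
    """Analytic partial derivatives (dH/dq, dH/dp) of the H defined above."""
    a, b, c = theta
    dH_dq = b * q + 2.0 * c * q * np.dot(p, p)
    dH_dp = a * p + 2.0 * c * p * np.dot(q, q)
    return dH_dq, dH_dp

def vector_field(q, p, theta):
    """Hamilton's equations: q_dot = dH/dp, p_dot = -dH/dq."""
    dH_dq, dH_dp = grad_hamiltonian(q, p, theta)
    return dH_dp, -dH_dq

theta = (1.0, 2.0, 0.3)          # illustrative parameter values
q = np.array([0.4, -1.1])
p = np.array([0.7, 0.2])
q_dot, p_dot = vector_field(q, p, theta)

# Along the exact flow, dH/dt = dH/dq . q_dot + dH/dp . p_dot vanishes.
dH_dq, dH_dp = grad_hamiltonian(q, p, theta)
energy_rate = np.dot(dH_dq, q_dot) + np.dot(dH_dp, p_dot)
```

The vanishing `energy_rate` is the continuous-time conservation property that a learned model inherits when its dynamics are generated from $H$ rather than fitted as an unconstrained vector field.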

Generalized and Dissipative Systems

For dissipative or port-Hamiltonian systems,

$\dot x = [J(x)-R(x)]\nabla H(x) + G(x)u,$

where $J$ is skew-symmetric, $R$ is positive semi-definite, and $G$ parameterizes the input-port couplings through which the input $u$ acts (Beckers et al., 2023, Ewering et al., 7 Nov 2025). Both classical and quantum models are encompassed.
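The structural conditions on $J$ and $R$ imply an energy balance: the stored energy changes by the supplied power minus the dissipated power. The sketch below checks this for a two-state example; the quadratic energy, the specific matrices, and the collocated output $y = G^\top \nabla H$ are illustrative assumptions.

```python
import numpy as np

Q = np.diag([2.0, 1.0])                    # weights of an assumed quadratic energy
J = np.array([[0.0, 1.0], [-1.0, 0.0]])    # skew-symmetric interconnection
R = np.diag([0.0, 0.5])                    # positive semi-definite damping
G = np.array([[0.0], [1.0]])               # input-port map

def grad_H(x):
    """Gradient of the assumed energy H(x) = 0.5 * x^T Q x."""
    return Q @ x

def ph_rhs(x, u):
    """Port-Hamiltonian dynamics: x_dot = (J - R) grad H(x) + G u."""
    return (J - R) @ grad_H(x) + G @ u

x = np.array([1.0, -0.5])
u = np.array([0.3])

gH = grad_H(x)
x_dot = ph_rhs(x, u)
y_out = G.T @ gH                           # collocated output (assumption)

dH_dt = float(gH @ x_dot)                  # chain rule along the flow
supplied = float(y_out @ u)                # power entering through the port
dissipated = float(gH @ R @ gH)            # power lost to damping, always >= 0
```

Because the $J$ term drops out of `dH_dt` by skew-symmetry, `dH_dt` equals `supplied - dissipated`, which is the passivity property that structure-preserving learning aims to retain.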

Measurement Models and Noise

Observations $y_k$ are modeled as

$y_k = \eta_k \odot h(x_k) + \nu_k,$

where $h$ is the measurement function, $\odot$ denotes the elementwise product, and $\eta_k$ (multiplicative noise) and $\nu_k$ (additive noise) are vector-valued and may be statistically dependent (Galioto et al., 2024). Non-Gaussian, structured, and correlated noise models are directly supported in Bayesian formulations.
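To make the role of dependent multiplicative and additive noise concrete, the following sketch simulates observations under one assumed form, an elementwise multiplicative factor with mean one plus a zero-mean additive term, drawn jointly so that the two are correlated. The Gaussian choice, the correlation parameter `rho`, and the function name `observe` are assumptions for illustration only.

```python
import numpy as np

def observe(h_x, mult_std, add_std, rho, rng):
    """Return eta * h_x + nu with correlated Gaussian noise (assumed form).

    eta ~ N(1, mult_std^2) and nu ~ N(0, add_std^2) are drawn jointly for each
    component with correlation rho, so the two noise sources are dependent.
    """
    cross = rho * mult_std * add_std
    cov = np.array([[mult_std**2, cross],
                    [cross, add_std**2]])
    draws = rng.multivariate_normal([1.0, 0.0], cov, size=h_x.shape)
    eta, nu = draws[..., 0], draws[..., 1]
    return eta * h_x + nu

rng = np.random.default_rng(0)
h_x = np.full(20000, 2.0)                  # noiseless measurements, held constant
y = observe(h_x, mult_std=0.1, add_std=0.05, rho=0.5, rng=rng)

# For constant h, Var(y) = h^2 s_m^2 + s_a^2 + 2 h rho s_m s_a.
expected_var = 2.0**2 * 0.1**2 + 0.05**2 + 2 * 2.0 * 0.5 * 0.1 * 0.05
```

The cross term in `expected_var` shows why ignoring the dependence between the two noise sources misstates the observation variance, and hence the likelihood.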

Quantum Hamiltonian Parameterizations

For quantum systems, especially $k$-body models,

$H(\theta) = \sum_j \theta_j P_j,$

where $P_j$ are Pauli strings of weight at most $k$. Control Hamiltonians and state preparation augment identifiability (Evans et al., 2019).
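A Hamiltonian expanded in Pauli strings can be assembled with Kronecker products. The sketch below builds a three-qubit example whose terms have weight at most two; the particular terms and coefficients are made up for illustration.

```python
import numpy as np
from functools import reduce

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. 'ZZI' -> Z (x) Z (x) I."""
    return reduce(np.kron, (PAULI[c] for c in label))

def weight(label):
    """Number of non-identity factors in a Pauli string."""
    return sum(c != "I" for c in label)

def build_hamiltonian(terms):
    """H = sum_j theta_j P_j for a dict {label: theta}."""
    return sum(theta * pauli_string(label) for label, theta in terms.items())

# Illustrative 2-body model on three qubits: ZZ couplings plus transverse fields.
terms = {"ZZI": 1.0, "IZZ": 0.8, "XII": 0.3, "IXI": 0.3, "IIX": 0.3}
H = build_hamiltonian(terms)
max_weight = max(weight(label) for label in terms)

# Pauli strings are trace-orthogonal, so each coefficient is Tr(P_j H) / 2^n.
theta_zzi = np.trace(pauli_string("ZZI") @ H).real / 2**3
```

Trace-orthogonality is what makes the coefficients $\theta_j$ well-defined targets for inference: each one is a distinct linear functional of $H$.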

2. Bayesian Inference: Posterior Structure and Noise Marginalization

The cornerstone of BHL is the explicit construction and inference of joint posteriors over all unknowns:

$p(\theta, \phi \mid \mathcal{Y}) \propto p(\mathcal{Y} \mid \theta, \phi)\, p(\theta)\, p(\phi),$

where $\mathcal{Y}$ is the observed data, $\theta$ are Hamiltonian and possibly dynamical parameters, $\phi$ collects hyperparameters governing noise processes, the priors $p(\theta)$ and $p(\phi)$ enforce structure or domain knowledge, and $p(\mathcal{Y} \mid \theta, \phi)$ is the noisy-data likelihood (Galioto et al., 2024).
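As a worked step, write $\theta$ for the parameters, $\phi$ for the noise hyperparameters, and $\mathcal{Y}$ for the data (notation assumed here), and assume independent priors on $\theta$ and $\phi$. Taking logarithms turns the posterior product into a sum:

$\log p(\theta,\phi \mid \mathcal{Y}) = \log p(\mathcal{Y} \mid \theta,\phi) + \log p(\theta) + \log p(\phi) + \text{const}.$

A maximum a posteriori (MAP) estimate therefore maximizes the log-likelihood plus the two log-prior terms. The additive constant is the negative log evidence; it does not depend on $(\theta,\phi)$, so it can be dropped in MAP optimization and cancels in MCMC acceptance ratios.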

Likelihood Construction

For systems with measurement and process noise (multiplicative and additive), Gaussian filter approximations yield tractable marginal likelihoods for hidden Markov models, using predicted and updated means/covariances in a Kalman/unscented-Kalman framework (Galioto et al., 2024, Kim et al., 31 Jan 2025).
For quantum measurement, Bayesian updates are performed via the Born-rule likelihood or constraint equations derived from commutation relations of observables (Evans et al., 2019).
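The filtering-based marginal likelihood can be illustrated in the linear special case, where the Kalman filter is exact rather than an approximation. The sketch below scores candidate frequencies of a harmonic oscillator $H = \tfrac12(p^2 + \omega^2 q^2)$ observed through noisy position measurements; the model, noise levels, and function names are assumptions chosen for illustration, and nonlinear Hamiltonians would need an unscented or other Gaussian filter in place of the linear updates.

```python
import numpy as np

def transition(omega, dt):
    """Exact flow map of H = 0.5 * (p^2 + omega^2 q^2) over one step dt."""
    c, s = np.cos(omega * dt), np.sin(omega * dt)
    return np.array([[c, s / omega], [-omega * s, c]])

def kalman_log_likelihood(ys, omega, dt, m0, P0, Qn, Rn):
    """Log marginal likelihood of position data via prediction errors."""
    A = transition(omega, dt)
    C = np.array([[1.0, 0.0]])                  # observe position only
    m, P, ll = m0.copy(), P0.copy(), 0.0
    for y in ys:
        m = A @ m                               # predict mean
        P = A @ P @ A.T + Qn                    # predict covariance
        v = y - (C @ m)[0]                      # innovation
        S = (C @ P @ C.T)[0, 0] + Rn            # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * S) + v**2 / S)
        K = (P @ C.T) / S                       # Kalman gain, shape (2, 1)
        m = m + K[:, 0] * v                     # update mean
        P = P - K @ C @ P                       # update covariance
    return ll

rng = np.random.default_rng(0)
omega_true, dt, n_steps = 1.5, 0.1, 200
meas_std, proc_std = 0.05, 1e-3
A_true = transition(omega_true, dt)
x = np.array([1.0, 0.0])
ys = []
for _ in range(n_steps):
    x = A_true @ x + proc_std * rng.standard_normal(2)
    ys.append(x[0] + meas_std * rng.standard_normal())
ys = np.array(ys)

m0, P0 = np.array([1.0, 0.0]), 0.01 * np.eye(2)
Qn, Rn = proc_std**2 * np.eye(2), meas_std**2
candidates = np.linspace(1.0, 2.0, 21)
log_liks = np.array([kalman_log_likelihood(ys, w, dt, m0, P0, Qn, Rn)
                     for w in candidates])
best = candidates[np.argmax(log_liks)]
```

The latent states never appear in `log_liks`: they are marginalized by the filter, which is what makes the likelihood tractable for parameter inference.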

Priors and Hyperpriors

Flat, uninformative, or Gaussian priors are used for $\theta$.
Hyperpriors (e.g., half-normal over variances) enforce positivity and regularize high-dimensional parameterizations (Galioto et al., 2024).
Physics-informed priors (e.g., Gaussian process priors on $H$) embed smoothness and invariance constraints (Beckers et al., 2023).

Posterior Inference Algorithms

Algorithm	Use Case	Complexity/Remarks
Gaussian filtering	Nonlinear SSMs, additive/mult. noise	Cost scales with the full state dimension; reduced to the ROM dimension after projection
Sequential Monte Carlo	Online quantum parameter learning	Tunable by particles/resampling, adaptive experimental design
MCMC/Metropolis-Gibbs	Parameter/hyperparameter sampling	Suitable for MAP and full posterior uncertainty quantification
Particle Gibbs/PGAS	Joint state and parameter inference	Enables tractable full Bayesian inference for latent SSMs

Computations are enabled by differentiable likelihoods (for gradient-based MAP), analytical conjugacy (for GPs), and reduced-rank surrogates for scalability (Evans et al., 2019, Ewering et al., 7 Nov 2025, Galioto et al., 2024).
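As one concrete instance of the sampling-based entries in the table, the sketch below runs a random-walk Metropolis sampler jointly over a Hamiltonian parameter and a noise hyperparameter. The setup is an assumption for illustration: a harmonic oscillator with $q(t)=\cos(\omega t)$, a Gaussian prior on $\omega$, a half-normal hyperprior on the noise scale $\sigma$, and sampling in $\log\sigma$ to keep $\sigma$ positive.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 40)
omega_true, sigma_true = 1.5, 0.1
y = np.cos(omega_true * t) + sigma_true * rng.standard_normal(t.size)

def log_posterior(omega, log_sigma):
    """Unnormalized log posterior over (omega, log sigma)."""
    sigma = np.exp(log_sigma)
    resid = y - np.cos(omega * t)
    log_lik = -0.5 * np.sum(resid**2) / sigma**2 - t.size * np.log(sigma)
    log_prior_omega = -0.5 * (omega - 1.0) ** 2      # N(1, 1) prior (assumed)
    log_prior_sigma = -0.5 * sigma**2 + log_sigma    # half-normal(1) + Jacobian
    return log_lik + log_prior_omega + log_prior_sigma

state = np.array([1.0, np.log(0.5)])                 # start away from the truth
step = np.array([0.02, 0.15])                        # proposal scales (hand-tuned)
lp = log_posterior(*state)
samples, accepted = [], 0
for _ in range(8000):
    proposal = state + step * rng.standard_normal(2)
    lp_prop = log_posterior(*proposal)
    if np.log(rng.random()) < lp_prop - lp:          # Metropolis accept/reject
        state, lp = proposal, lp_prop
        accepted += 1
    samples.append(state.copy())

samples = np.array(samples[3000:])                   # discard burn-in
omega_mean = samples[:, 0].mean()
omega_std = samples[:, 0].std()
sigma_mean = np.exp(samples[:, 1]).mean()
```

The spread `omega_std` is the kind of uncertainty estimate that a point estimate cannot provide; gradient-based samplers or particle Gibbs would replace the random-walk proposal when the parameter space is larger or latent states must be sampled.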

3. Structure Preservation, Reduced-Order and GP Surrogates

Structure-Preserving Integrators

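Parametrizing the flow through a Hamiltonian and employing explicit symplectic integrators (e.g., Tao's explicit scheme for nonseparable systems) ensures that learned models are symplectic: the discrete flow preserves the symplectic form, and hence phase-space volume, and typically conserves energy approximately, without systematic drift, over long horizons (Galioto et al., 2024).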
For generalized Hamiltonian flows, structure-preserving kernels or basis expansions (e.g., symplectic SVD, random Fourier features) ensure volume preservation and dissipativity where required (McLennan et al., 8 Sep 2025).
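An explicit symplectic step for a nonseparable Hamiltonian can be sketched with the extended phase-space splitting on which Tao's scheme is based: the state $(q,p)$ is duplicated as $(x,y)$, the two copies are advanced by explicitly solvable sub-flows, and a binding term of strength $\omega$ keeps them together. The test Hamiltonian $H=\tfrac12(q^2+1)(p^2+1)$, the step size, and the binding strength below are illustrative choices, not values taken from the cited work.

```python
import numpy as np

def H(q, p):
    """Nonseparable test Hamiltonian (illustrative)."""
    return 0.5 * (q**2 + 1.0) * (p**2 + 1.0)

def dH_dq(q, p):
    return q * (p**2 + 1.0)

def dH_dp(q, p):
    return p * (q**2 + 1.0)

def phi_A(z, h):
    """Exact flow of H(q, y): updates p and x, leaves q and y fixed."""
    q, p, x, y = z
    return (q, p - h * dH_dq(q, y), x + h * dH_dp(q, y), y)

def phi_B(z, h):
    """Exact flow of H(x, p): updates q and y, leaves x and p fixed."""
    q, p, x, y = z
    return (q + h * dH_dp(x, p), p, x, y - h * dH_dq(x, p))

def phi_C(z, h, omega):
    """Exact flow of the binding term: rotates the differences (q-x, p-y)."""
    q, p, x, y = z
    c, s = np.cos(2 * omega * h), np.sin(2 * omega * h)
    u, v = q - x, p - y
    ur, vr = c * u + s * v, -s * u + c * v
    return (0.5 * (q + x + ur), 0.5 * (p + y + vr),
            0.5 * (q + x - ur), 0.5 * (p + y - vr))

def tao_step(z, h, omega):
    """Second-order symmetric composition A/2, B/2, C, B/2, A/2."""
    z = phi_A(z, 0.5 * h)
    z = phi_B(z, 0.5 * h)
    z = phi_C(z, h, omega)
    z = phi_B(z, 0.5 * h)
    return phi_A(z, 0.5 * h)

q0, p0 = 0.5, 0.0
z = (q0, p0, q0, p0)                 # both copies start at the same point
H0 = H(q0, p0)
for _ in range(1000):
    z = tao_step(z, h=0.01, omega=20.0)
energy_error = abs(H(z[0], z[1]) - H0)
copy_defect = abs(z[0] - z[2]) + abs(z[1] - z[3])
```

Each sub-flow is an exact Hamiltonian flow in the extended space, so their composition is symplectic there; the small `copy_defect` indicates that the two copies remain close, which is what lets $(q,p)$ be read off as the approximate solution.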

Reduced-Order Modeling

For high-dimensional systems (e.g., discretized PDEs with a large number of state variables), BHL employs reduced-order projection:
- Cotangent-lifted symplectic SVD yields a symplectic basis $V$ of reduced dimension $2r$ (with $r \ll d$), projecting dynamics and measurements for fast learning and filtering (Galioto et al., 2024).
- Hamiltonian operator inference (H-OpInf) can recover the quadratic part analytically, reducing the burden on the nonlinear surrogate.
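The cotangent-lift construction can be sketched in a few lines: position and momentum snapshots are stacked side by side, a single SVD gives a shared orthonormal basis, and that basis is placed block-diagonally so that the result is symplectic. The snapshot data below are random placeholders and the function names are hypothetical.

```python
import numpy as np

def cotangent_lift_basis(Q_snap, P_snap, r):
    """Symplectic basis V (2n x 2r) from (n x m) position/momentum snapshots."""
    stacked = np.hstack([Q_snap, P_snap])              # shape (n, 2m)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    Phi = U[:, :r]                                     # shared orthonormal basis
    Z = np.zeros_like(Phi)
    return np.block([[Phi, Z], [Z, Phi]])              # V = diag(Phi, Phi)

def canonical_J(n):
    """Canonical symplectic matrix J_{2n} = [[0, I], [-I, 0]]."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

rng = np.random.default_rng(0)
n, m, r = 50, 30, 4
Q_snap = rng.standard_normal((n, m))                   # placeholder snapshots
P_snap = rng.standard_normal((n, m))
V = cotangent_lift_basis(Q_snap, P_snap, r)

# Symplecticity: V^T J_{2n} V = J_{2r}.
residual = np.linalg.norm(V.T @ canonical_J(n) @ V - canonical_J(r))

# Reduced coordinates of a full state; for this V the symplectic inverse is V^T.
x_full = np.concatenate([Q_snap[:, 0], P_snap[:, 0]])
z_reduced = V.T @ x_full
```

Because $V^\top J_{2n} V = J_{2r}$, a Hamiltonian system projected with $V$ is again Hamiltonian in the reduced coordinates, so structure-preserving integrators and filters apply unchanged at dimension $2r$.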

Gaussian Process and Random Feature Surrogates

Hamiltonians are modeled nonparametrically by GPs:
- For port-Hamiltonian and input–output systems, the Hamiltonian $H$ is given a GP prior. Uncertainty in $H$ and its derivatives propagates through trajectories, yielding a full posterior over future states and outputs (Beckers et al., 2023, Ewering et al., 7 Nov 2025).
- Sparse random Fourier feature expansions of GPs yield scalable surrogates, facilitating variational Bayesian inference with physics-informed regularization (McLennan et al., 8 Sep 2025).
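A random-feature surrogate makes the Hamiltonian linear in its weights, so a Gaussian prior on the weights gives a closed-form Gaussian posterior from vector-field data. The sketch below uses conjugate Bayesian linear regression rather than the variational inference mentioned above, and the target system, feature count, length scale, and noise levels are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M, ell = 100, 1.0                                  # feature count, length scale
W = rng.normal(0.0, 1.0 / ell, size=(M, 2))        # random frequencies (RBF kernel)
b = rng.uniform(0.0, 2 * np.pi, size=M)            # random phases
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def grad_features(x):
    """Gradients of the features sqrt(2/M) cos(W x + b); shape (2, M)."""
    return -np.sqrt(2.0 / M) * np.sin(W @ x + b) * W.T

def field_design(x):
    """Rows mapping weights c to x_dot = J grad H(x), with H(x) = c . features."""
    return J @ grad_features(x)

# Training data from H = 0.5 |x|^2, whose vector field is x_dot = J x.
X = rng.uniform(-1.5, 1.5, size=(200, 2))
Xdot = X @ J.T + 0.01 * rng.standard_normal((200, 2))
A = np.vstack([field_design(x) for x in X])        # shape (400, M)
y = Xdot.reshape(-1)                               # same row order as A

noise_std, prior_var = 0.05, 100.0                 # assumed likelihood and prior
precision = A.T @ A / noise_std**2 + np.eye(M) / prior_var
c_mean = np.linalg.solve(precision, A.T @ y / noise_std**2)
c_cov = np.linalg.inv(precision)                   # posterior weight covariance

# Predictive mean and variance of the vector field at held-out points.
X_test = rng.uniform(-1.5, 1.5, size=(100, 2))
pred = np.array([field_design(x) @ c_mean for x in X_test])
rmse = np.sqrt(np.mean((pred - X_test @ J.T) ** 2))
F0 = field_design(X_test[0])
pred_var = np.diag(F0 @ c_cov @ F0.T)

# Structure check: the learned field is orthogonal to the learned energy gradient.
gH = grad_features(X_test[0]) @ c_mean
structure_residual = abs(gH @ (J @ gH))
```

Whatever the fitted weights are, `structure_residual` vanishes because the field is generated as $J\nabla H$; the data only determine which Hamiltonian is learned, not whether the model is Hamiltonian.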

4. Online, Adaptive, and Data-Efficient Bayesian Protocols

Adaptive Experiment Design

BHL protocols for quantum systems employ Bayesian experimental design: utilities such as information gain or expected risk, evaluated under the current posterior, select the next experiment, which substantially increases the information obtained per data point (Granade et al., 2012, Hincks et al., 2018).
Online updating of the posterior as data accrue allows real-time feedback and efficient control (Evans et al., 2019, Kim et al., 31 Jan 2025).
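The online loop of choosing an experiment, observing an outcome, and updating the posterior can be sketched for a single precessing qubit with Born-rule likelihood $P(0 \mid \omega, t) = \cos^2(\omega t/2)$. For brevity the sketch keeps the posterior on a fixed grid in place of sequential Monte Carlo particles, and it picks the evolution time with a simple heuristic (the inverse distance between two posterior draws) rather than by maximizing a utility; the true frequency, grid, and time cap are illustrative assumptions.

```python
import numpy as np

def likelihood(outcome, omega, t):
    """Born-rule probability of an outcome for a qubit precessing at omega."""
    p0 = np.cos(0.5 * omega * t) ** 2
    return p0 if outcome == 0 else 1.0 - p0

def posterior_moments(grid, post):
    mean = post @ grid
    return mean, np.sqrt(post @ (grid - mean) ** 2)

rng = np.random.default_rng(1)
true_omega, t_max = 0.7, 200.0
grid = np.linspace(0.0, 1.0, 2001)                 # candidate frequencies
post = np.full(grid.size, 1.0 / grid.size)         # flat prior on [0, 1]
_, prior_std = posterior_moments(grid, post)

for _ in range(150):
    # Design step: longer evolution times as the posterior narrows.
    w1, w2 = rng.choice(grid, size=2, p=post)
    t = min(1.0 / max(abs(w1 - w2), 1.0 / t_max), t_max)
    # Simulated experiment on the (unknown) true system.
    outcome = int(rng.random() >= likelihood(0, true_omega, t))
    # Bayes update on the grid.
    post = post * likelihood(outcome, grid, t)
    post = post / post.sum()

estimate, post_std = posterior_moments(grid, post)
```

Drawing the two frequencies from the current posterior ties the experiment to the remaining uncertainty, which is the essential feature of adaptive design; an information-gain utility would replace the heuristic at higher computational cost.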

Data Efficiency and Robustness

BHL approaches are robust to small, noisy, and highly corrupted datasets (e.g., datasets corrupted by multiplicative noise), outperforming standard ML baselines such as deep Hamiltonian neural networks (HNNs) by orders of magnitude in mean squared error for Hamiltonian inference (Galioto et al., 2024).
Training in reduced or projected spaces yields computational speedups without loss of physical interpretability.
The Bayesian posterior naturally quantifies uncertainty: its credible regions widen as noise or data sparsity increases.

5. Empirical Performance and Benchmarks

On canonical nonseparable Hamiltonian benchmarks, BHL achieves lower Hamiltonian mean squared error than standard HNN objectives under multiplicative noise (Galioto et al., 2024).
In chaotic, underdetermined, or high-dimensional settings (e.g., double pendulum, NLSE PDEs), BHL methods robustly capture phase-space structure, provide credible predictive envelopes, and maintain conservation laws under out-of-sample trajectory continuation (Galioto et al., 2024, Beckers et al., 2023).
For quantum many-body spin chains, the scalability of BHL is demonstrated: the posterior quickly concentrates as the number of control fields increases, and credible intervals match empirical errors (Evans et al., 2019).

6. Physical Consistency, Extensions, and Limitations

BHL intrinsically enforces physical correctness through structure-preserving learning (symplecticity, passivity, conservation, stability) (Galioto et al., 2024, Beckers et al., 2023, McLennan et al., 8 Sep 2025).
Extensions include hierarchical models for time-varying parameters, hyperparameter learning, non-conservative and input–output system identification, and compositional port-Hamiltonian networks (Ewering et al., 7 Nov 2025, Beckers et al., 2023).
Computational scaling depends on surrogate and filtering strategies: cubic in the number of data points for full GPs, governed by the reduced dimension in projected spaces, and linear in time and particle number for SMC/PGAS approaches.
Current limitations include scaling of dense GP models to large datasets and identifiability for some structural parameters in input–output settings (Ewering et al., 7 Nov 2025).

In summary, Bayesian Hamiltonian Learning provides a holistic, physics-informed statistical apparatus for identifying, simulating, and controlling complex Hamiltonian systems under uncertainty. Its strength lies in the principled integration of measurement noise, physical constraints, reduced-order modeling, uncertainty quantification, and adaptive design, with applications across quantum, classical, conservative, and dissipative regimes (Galioto et al., 2024, Evans et al., 2019, Beckers et al., 2023, McLennan et al., 8 Sep 2025, Ewering et al., 7 Nov 2025).