
Bayesian Inference for ODE Models

Updated 1 December 2025
  • Bayesian ODE inference frameworks are probabilistic models that use prior distributions and likelihoods to quantify uncertainty in dynamic system parameters.
  • Modern methods combine simulation-based integration, state-space filtering, and spline or neural/GP surrogates to achieve flexible and efficient inference.
  • These approaches are applied in fields like epidemiology, ecology, and chemical kinetics to ensure robust parameter recovery and calibrated predictive uncertainty.

A Bayesian inference framework for ordinary differential equation (ODE) models provides a principled means to quantify parameter uncertainty, propagate model uncertainty through nonlinear dynamics, and perform model-based prediction with calibrated uncertainty intervals. Modern approaches span classical parametric and nonparametric formulations, state-space and filtering representations, spline or collocation expansions, fully simulation-based solver integration, and hybrid learning with neural networks and Gaussian processes. These frameworks are applicable across scientific domains involving continuous-time dynamical systems, from epidemiology and ecology to chemical kinetics and engineered physical systems.

1. Model Formulation and Bayesian Structure

In the general Bayesian ODE setting, a multivariate latent state $x(t) \in \mathbb{R}^n$ evolves according to

$$\frac{dx}{dt} = f(x(t), u(t), d(t), t;\, w),$$

where $f$ is either a known parametric function with finite-dimensional parameters $\theta$, or a highly flexible representation such as a neural network or a Gaussian process, depending on the inference paradigm. The observations $y_j$ at measurement times $t_j$ are modeled as

$$y_j = H\, x(t_j; \theta) + \varepsilon_j,$$

with $H$ a selection or transformation matrix and $\varepsilon_j$ independent measurement noise (usually Gaussian with covariance $\Gamma$).

A Bayesian framework comprises: (i) a prior $p(\theta)$ (or $p(w)$) over model parameters or vector-field representations, (ii) a likelihood $p(y \mid x(\theta))$ expressing the probabilistic connection between observations and model trajectories, and (iii) a posterior $p(\theta \mid y)$ obtained via Bayes' rule.
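
As a concrete illustration, the following minimal sketch assembles the three ingredients into an unnormalized log posterior. It uses SciPy; the logistic-growth vector field, the identity observation matrix, and all prior scales are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import norm

def f(t, x, theta):
    """Illustrative vector field: logistic growth with theta = (r, K)."""
    r, K = theta
    return r * x * (1.0 - x / K)

def solve_ode(theta, x0, t_obs):
    """x(t_j; theta) via an adaptive Runge-Kutta solver (RK45)."""
    sol = solve_ivp(f, (t_obs[0], t_obs[-1]), [x0], args=(theta,),
                    t_eval=t_obs, rtol=1e-8, atol=1e-10)
    return sol.y.ravel()

def log_prior(theta):
    """(i) Prior p(theta): independent half-normal priors on (r, K)."""
    theta = np.asarray(theta)
    if np.any(theta <= 0):
        return -np.inf
    return norm.logpdf(theta, loc=0.0, scale=10.0).sum()

def log_likelihood(theta, y, x0, t_obs, sigma=0.1):
    """(ii) Likelihood: y_j = x(t_j; theta) + eps_j, i.e. H = identity."""
    return norm.logpdf(y, loc=solve_ode(theta, x0, t_obs), scale=sigma).sum()

def log_posterior(theta, y, x0, t_obs):
    """(iii) Unnormalized posterior log density via Bayes' rule."""
    lp = log_prior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(theta, y, x0, t_obs)
```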

2. Computational Methodologies

2.1 Numerical Integration and Posterior Construction

When $f$ is nontrivial, $x(t; \theta)$ lacks a closed form and must be approximated numerically. The primary approaches are:

  • Simulation-based Likelihood ("plug-in" ODE solvers):

Posterior inference relies on repeatedly integrating the ODE (e.g., with explicit Runge–Kutta or adaptive solvers) across the parameter posterior, with observation noise incorporated at measurement times. This is the backbone of the workflow in recent comparative studies of biological ODE models (Mohammed et al., 19 Nov 2025). Adaptive integrators (RK45, BDF) are routinely used, with fully Bayesian sampling conducted via Hamiltonian Monte Carlo (HMC/NUTS) using dedicated ODE solvers in Stan or similar environments.
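
The cited studies run HMC/NUTS in Stan; as a dependency-light stand-in, the sketch below wraps the plug-in log posterior from Section 1 in a random-walk Metropolis loop, with one numerical ODE solve per proposal. Step size and iteration count are illustrative.

```python
import numpy as np

def metropolis(log_post, theta0, n_iter=5000, step=0.05, seed=None):
    """Random-walk Metropolis over a plug-in ODE posterior; each call to
    log_post triggers one numerical integration of the ODE."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        samples[i] = theta
    return samples

# Usage with the log_posterior sketch from Section 1:
# draws = metropolis(lambda th: log_posterior(th, y, x0, t_obs), [0.5, 10.0])
```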

  • Bayesian Filtering and State-Space Relaxation:

By discretizing time and adding small process noise to the deterministic ODE flow, the problem is posed as inference in a state-space model,

$$x_{i+1} = \Phi_h(x_i, \theta) + \eta_i,$$

with $\Phi_h$ a one-step integrator and $\eta_i$ small Gaussian noise (Lee et al., 2017). This formulation enables sequential Monte Carlo (particle filtering), Rao–Blackwellization, and rapid approximate inference for both parameters and latent trajectories.
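
A bootstrap particle filter over this relaxation might look as follows. This is a sketch assuming a scalar state, direct noisy observation of the state, and a user-supplied vectorized one-step integrator `step` (e.g., a single RK4 step); all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def bootstrap_pf(y, particles, theta, step, h, sigma_proc, sigma_obs, seed=None):
    """Bootstrap particle filter for x_{i+1} = Phi_h(x_i, theta) + eta_i,
    where step(x, theta, h) implements Phi_h for an array of particles."""
    rng = np.random.default_rng(seed)
    particles = np.asarray(particles, dtype=float)
    n, loglik = particles.size, 0.0
    for y_i in y:
        # Propagate each particle through the integrator plus process noise.
        particles = step(particles, theta, h) + sigma_proc * rng.standard_normal(n)
        # Weight by the Gaussian observation density; accumulate log p(y_i | past).
        logw = norm.logpdf(y_i, loc=particles, scale=sigma_obs)
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())
        # Multinomial resampling back to equal weights.
        particles = rng.choice(particles, size=n, p=w / w.sum())
    return loglik, particles
```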

  • Probabilistic Numerics and Kalman Filtering:

When $f$ is sufficiently linear or can be linearized, the ODE solution is modeled as a Gauss–Markov process (e.g., an integrated Wiener process for unbounded derivatives, an integrated Ornstein–Uhlenbeck process for bounded derivatives) (Magnani et al., 2017, Kersting et al., 2016). Kalman filtering or its extensions (EKF, Rauch–Tung–Striebel smoothing) facilitate closed-form inference for the mean and covariance of the trajectory at each time point, and the propagation of numerical solver uncertainty is explicit.
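
The sketch below implements the simplest instance of this idea for a scalar autonomous ODE: a once-integrated Wiener process prior on $(x, x')$ with an EKF-style update against the pseudo-observation $0 = x'(t) - f(x(t))$ at each grid point. The `f`/`df_dx` interface and the diffusion scale `q` are assumptions; the cited papers develop more general formulations.

```python
import numpy as np

def iwp1_ode_filter(f, df_dx, x0, t_grid, q=1.0):
    """Probabilistic-numerics sketch for x' = f(x): IWP(1) prior on
    z = (x, x'), EKF update on the ODE residual at every grid point."""
    m = np.array([x0, f(x0)])                    # filter mean of (x, x')
    P = np.zeros((2, 2))                         # exact initial condition
    means, covs = [m.copy()], [P.copy()]
    for i in range(len(t_grid) - 1):
        h = t_grid[i + 1] - t_grid[i]
        A = np.array([[1.0, h], [0.0, 1.0]])     # IWP(1) transition matrix
        Q = q * np.array([[h**3 / 3, h**2 / 2],
                          [h**2 / 2, h       ]]) # IWP(1) process covariance
        m, P = A @ m, A @ P @ A.T + Q            # Kalman predict
        H = np.array([[-df_dx(m[0]), 1.0]])      # Jacobian of z -> x' - f(x)
        r = m[1] - f(m[0])                       # residual of the ODE constraint
        S = H @ P @ H.T                          # innovation variance, (1, 1)
        K = P @ H.T / S                          # Kalman gain, (2, 1)
        m = m - (K * r).ravel()                  # update against measurement 0
        P = P - K @ H @ P
        means.append(m.copy()); covs.append(P.copy())
    return np.array(means), np.array(covs)
```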

2.2 Spline, Collocation, and Surrogate Representations

  • Spline-Based Two-Step and Collocation Methods:

Observed data are first smoothed via flexible basis expansions (B-splines), generating a nonparametric posterior for the latent trajectory. The ODE parameters $\theta$ are then inferred by minimizing (or integrating over) the discrepancy between the derivative of the spline fit and the ODE right-hand side (Bhaumik et al., 2014, Xu et al., 2023). Bayesian hierarchical structure is imposed by conjugate priors on the spline coefficients and penalty hyperparameters. Efficient Gibbs samplers or variational inference (VI) are deployed, with Bernstein–von Mises theorems guaranteeing $\sqrt{n}$-contraction rates for $\theta$.
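
A point-estimate caricature of the two-step idea (smooth, then match derivatives) is sketched below using SciPy's smoothing spline (available in SciPy ≥ 1.10). The fully Bayesian versions instead place conjugate priors on the spline coefficients and sample or optimize them; `rhs` and `theta0` are user-supplied.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline
from scipy.optimize import minimize

def two_step_fit(t_obs, y, rhs, theta0, n_grid=200):
    """Step 1: smooth the data with a penalized spline. Step 2: choose theta
    so the spline's derivative matches rhs(x, theta) on a dense grid."""
    spline = make_smoothing_spline(t_obs, y)          # step 1: smoothing
    t_grid = np.linspace(t_obs[0], t_obs[-1], n_grid)
    x_hat = spline(t_grid)                            # fitted trajectory
    dx_hat = spline.derivative()(t_grid)              # its derivative
    def discrepancy(theta):                           # step 2: matching
        return np.mean((dx_hat - rhs(x_hat, theta)) ** 2)
    return minimize(discrepancy, theta0, method="Nelder-Mead")
```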

  • Integral Collocation:

The integrated ODE is enforced as a prior on the spline curve coefficients, penalizing deviations from the integral constraint (Xu et al., 2023). Combined with a Gaussian observation model, this approach provides robust parameter recovery without repeated ODE solves, which is particularly appealing for stiff or nonlinear systems and sparse-data regimes; a minimal sketch of the penalty follows.
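
One hedged way to encode the integral constraint as a log-prior penalty on curve values at collocation points (trapezoidal quadrature and the penalty weight `lam` are illustrative choices, not the construction of the cited paper):

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

def integral_penalty(x_grid, t_grid, theta, rhs, lam=1.0):
    """Log-prior term enforcing the integral form of the ODE,
    x(t) = x(0) + int_0^t rhs(x(s); theta) ds, at collocation points."""
    integral = cumulative_trapezoid(rhs(x_grid, theta), t_grid, initial=0.0)
    return -lam * np.sum((x_grid - (x_grid[0] + integral)) ** 2)
```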

  • Neural/ANN Surrogates:

Deep neural surrogates are fit to approximate the ODE solution $x(t; \phi)$ as a function of parameters and initial conditions, trained on a dense collocation grid seeded by numerical ODE solutions (Kwok et al., 2022). After training, inference proceeds via Laplace approximation or gradient-based MCMC directly over the surrogate, enabling fast posterior computation in high-dimensional settings.
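
A minimal surrogate-training sketch, assuming a scalar state, a scikit-learn MLP, and illustrative grid sizes and tolerances (the cited work uses its own architectures and training setup):

```python
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.neural_network import MLPRegressor

def train_surrogate(f, x0, theta_samples, t_grid):
    """Fit an MLP surrogate (theta, t) -> x(t; theta) on a grid of
    numerical solver runs seeded at sampled parameter values."""
    X, Y = [], []
    for theta in theta_samples:
        sol = solve_ivp(f, (t_grid[0], t_grid[-1]), [x0],
                        args=(theta,), t_eval=t_grid, rtol=1e-8)
        X.append(np.column_stack([np.tile(theta, (len(t_grid), 1)), t_grid]))
        Y.append(sol.y.ravel())
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
    net.fit(np.vstack(X), np.concatenate(Y))
    return net   # afterwards, x(t; theta) ~= net.predict([[*theta, t]])
```

Once trained, the surrogate replaces the solver inside the log likelihood, so each MCMC or Laplace evaluation costs one network forward pass instead of a full integration.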

3. Hierarchical Priors and Model Uncertainty Quantification

A distinctive feature of Bayesian ODE inference is the formal quantification of epistemic and aleatoric uncertainties. Various prior structures are deployed:

  • Weakly/Strongly Informative Priors:

Biological and physical models use domain-informed or weakly informative priors (e.g., truncated Gaussians, half-Cauchy on scale parameters) to regularize inference under partial observability (Mohammed et al., 19 Nov 2025).
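
For instance, a sketch of such a log prior; the particular locations, scales, and truncation bounds are illustrative, not taken from the cited study:

```python
import numpy as np
from scipy.stats import truncnorm, halfcauchy

def log_prior(rates, sigma):
    """Weakly informative sketch: positive rate parameters get a Normal(1, 1)
    truncated to (0, inf); the noise scale gets a Half-Cauchy(0, 1)."""
    a = (0.0 - 1.0) / 1.0   # lower truncation bound in standardized units
    lp = truncnorm.logpdf(rates, a=a, b=np.inf, loc=1.0, scale=1.0).sum()
    return lp + halfcauchy.logpdf(sigma, scale=1.0)
```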

  • Sparsity-Promoting Priors:

For nonparametric neural or GP vector fields $f$, priors such as elementwise Gaussian shrinkage, horseshoe, or spike-and-slab are imposed to encourage network sparsity and model selection (Boersma et al., 2 Jul 2024).

  • Joint Uncertainty on Parameters and Discretization Error:

Recent work explicitly models discretization error as a latent Gaussian process with a Markov prior, facilitating joint inference over ODE parameters and solver error variances, thereby preventing overconfident point estimates (Toyota et al., 28 Nov 2025).

Table: Bayesian ODE Model Priors

| Framework | Prior Type(s) | Typical Use |
|---|---|---|
| Classical parametric (Stan) | Truncated Normal, Half-Cauchy | Kinetic/epidemic models |
| GP vector-field ODEs | Separable RBF, ARD | Nonparametric system identification |
| Neural ODEs | Gaussian, Horseshoe, Spike-and-Slab | Universal approximation |
| Spline/collocation | Gaussian on coefficients, Gamma | Collocation/integral methods |

4. Inference Algorithms and Computational Strategies

HMC/NUTS samplers are widely adopted for parameter posteriors, especially when the posterior surfaces are high-dimensional or highly correlated (e.g., ecological or epidemiological compartmental models) (Mohammed et al., 19 Nov 2025, Dandekar et al., 2020). Gradient computation is performed via continuous adjoint sensitivity analysis, with the adjoint ODE integrated backwards in time to compute sensitivities at $O(1)$ memory cost (Boersma et al., 2 Jul 2024), as sketched below. Robust diagnostics such as the Gelman–Rubin $\hat R$ and effective sample size are standard.
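
Concretely, for a loss $L(\theta) = \sum_j \ell\big(y_j,\, x(t_j;\theta)\big)$, the adjoint state $\lambda(t)$ obeys (jump terms at observation times omitted for brevity)

$$\frac{d\lambda}{dt} = -\left(\frac{\partial f}{\partial x}\right)^{\top} \lambda, \qquad \lambda(t_N) = \frac{\partial \ell}{\partial x(t_N)}, \qquad \frac{dL}{d\theta} = \int_{t_0}^{t_N} \lambda(t)^{\top}\, \frac{\partial f}{\partial \theta}\, dt,$$

solved backwards from $t_N$ to $t_0$ alongside the state; only $\lambda(t)$ and the running gradient integral need be stored, which is the source of the $O(1)$ memory cost.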

For scalability, mean-field VI is deployed, as in Bayesian neural ODEs with factorized normal posteriors and variational multiple shooting for GP-ODEs (Boersma et al., 2 Jul 2024, Hegde et al., 2021). ELBO optimization is conducted with stochastic reparameterization gradients; a minimal estimator is sketched below. VI is especially advantageous for large neural ODE models with millions of weights.
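
A sketch of the reparameterized Monte Carlo ELBO for a factorized normal posterior; in practice the gradient with respect to `(mu, log_sigma)` is taken with an autodiff framework rather than by hand, and `log_joint` is the model's joint log density.

```python
import numpy as np

def elbo_estimate(log_joint, mu, log_sigma, n_mc=16, seed=None):
    """Monte Carlo ELBO for q(w) = N(mu, diag(sigma^2)) via the
    reparameterization w = mu + sigma * eps, eps ~ N(0, I), plus the
    closed-form entropy of the diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_mc, mu.size))
    w = mu + sigma * eps                               # reparameterized draws
    expected_log_joint = np.mean([log_joint(wi) for wi in w])
    entropy = 0.5 * mu.size * (1.0 + np.log(2.0 * np.pi)) + log_sigma.sum()
    return expected_log_joint + entropy   # maximize over (mu, log_sigma)
```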

  • Filtering and Sequential Monte Carlo:

For models formulated as state-space systems or with time-varying uncertainties, Bayesian particle filtering or extended Kalman smoothing is used (Lee et al., 2017, Ajmal et al., 2019, Schmidt et al., 2021, Toyota et al., 28 Nov 2025). Self-organizing filters allow for online parameter learning and latent state reconstruction in parallel.

  • Importance Sampling Corrections:

Post-hoc importance sampling corrects for the bias induced by using approximate ODE solvers within MCMC: posterior samples are reweighted against higher-accuracy solutions, and solver-induced bias is diagnosed via the Pareto-smoothed importance sampling shape parameter $\hat k$ (Timonen et al., 2022).
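
The core reweighting step can be sketched as follows; the cited workflow additionally Pareto-smooths the raw weights and uses the fitted shape parameter $\hat k$ (e.g., $\hat k > 0.7$) to flag unreliable corrections, which plain self-normalized weights below do not do.

```python
import numpy as np

def is_correction(loglik_coarse, loglik_fine, samples):
    """Reweight MCMC draws obtained with a cheap solver against
    log likelihoods recomputed with a higher-accuracy solver."""
    logw = loglik_fine - loglik_coarse     # per-draw log importance ratios
    logw = logw - logw.max()               # stabilize before exponentiating
    w = np.exp(logw)
    w = w / w.sum()                        # self-normalized weights
    ess = 1.0 / np.sum(w ** 2)             # effective sample size diagnostic
    return w @ samples, w, ess             # reweighted posterior mean
```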

5. Predictive Posterior and Out-of-Sample Uncertainty

For all Bayesian ODE workflows, the posterior predictive for new data or future time points is computed as

$$p(x(t_*) \mid \mathcal D) = \int p(x(t_*) \mid w)\, q(w)\, dw,$$

where $q(w)$ is the posterior (MCMC or variational) over weights or parameters. Sample trajectories are integrated forward and their empirical distribution provides point estimates and credible bands, with full uncertainty propagation from measurement, parameter, and/or model errors. In state-space filtering approaches, the marginal at time $t_*$ is a Gaussian with mean and covariance propagated by the filter or smoother. For Bayesian neural ODEs, the ensemble of predictions from sampled weight vectors gives nonparametric state uncertainties (Boersma et al., 2 Jul 2024, Dandekar et al., 2020).
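
A minimal sketch of the sampling-based route, assuming a scalar state and parameter draws from any of the samplers above; tolerances and the credible level are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

def predictive_bands(f, x0, theta_draws, t_new, level=0.95):
    """Push posterior draws through the ODE and summarize the empirical
    posterior predictive with pointwise credible bands."""
    trajs = []
    for theta in theta_draws:                          # one solve per draw
        sol = solve_ivp(f, (t_new[0], t_new[-1]), [x0],
                        args=(theta,), t_eval=t_new, rtol=1e-8)
        trajs.append(sol.y.ravel())
    trajs = np.array(trajs)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(trajs, [alpha, 1.0 - alpha], axis=0)
    return trajs.mean(axis=0), lower, upper
```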

6. Extensions and Generalizations

  • Nonparametric and Data-Driven Vector Fields:

Gaussian-process priors over the ODE vector field $f$ enable nonparametric system identification. Sparse variational GP methods with inducing points scale GP-ODE posterior inference and allow for uncertainty quantification over the entire flow field (Hegde et al., 2021). Hybrid approaches (universal ODEs) embed neural networks or GPs as additive corrections to a mechanistic $f$, enabling joint mechanistic/data-driven inference (Dandekar et al., 2020).

  • Latent-Force and Nonstationary Models:

Probabilistic state-space constructs, with independent stochastic processes driving both the state and latent forces, enable joint inference on time-dependent or function-valued inputs (Schmidt et al., 2021).

  • Identifiability and Observability:

Bayesian methods clarify which parameters are structurally and practically identifiable given observability constraints, with strong priors providing needed regularization in ill-posed or partially observed settings (Mohammed et al., 19 Nov 2025).

  • Efficient Handling of Stiff or High-Dimensional Systems:

Probabilistic numerical methods (Bayesian filtering, quadrature filtering, particle smoothing) and surrogate modeling (deep or linear networks) enable fast computation for stiff, large-scale, or nonlinearly forced systems, without repeated high-precision ODE solutions at each MCMC microstep (Hegde et al., 2021, Kwok et al., 2022, Kersting et al., 2016). Relaxation and process-noise augmentations allow sequential inference even when analytic solutions are infeasible (Lee et al., 2017, Ajmal et al., 2019).

7. Empirical Performance and Comparative Results

Empirical evaluations consistently demonstrate that Bayesian ODE frameworks:

  • Achieve accurate parameter recovery and credible uncertainty assessment, with posterior intervals that track the true parameter even under sparse or noisy observations (Mohammed et al., 19 Nov 2025, Xu et al., 2023, Yang et al., 2020).
  • Outperform frequentist plug-in approaches when latent-state uncertainty is large or when data are only partially observed (Mohammed et al., 19 Nov 2025).
  • Retain correct coverage of forecast uncertainty (95% prediction intervals, weighted interval scores) in simulation and real biological datasets.
  • Exhibit favorable computational scaling and stability using collocation, variational, or surrogate-based approaches, relative to classical MCMC with embedded ODE solvers (Kwok et al., 2022, Yang et al., 2020).
  • Provide robust quantification of both parameter and numerical solver uncertainty, avoiding overconfidence that arises when discretization error is ignored (Toyota et al., 28 Nov 2025, Timonen et al., 2022).

The field continues to advance rapidly, with methodological innovations in sparse neural ODEs, nonparametric vector field posteriors, solver-error modeling, and hybrid mechanistic/data-driven frameworks, as demonstrated across controlled simulation and real-world systems (Boersma et al., 2 Jul 2024, Hegde et al., 2021, Toyota et al., 28 Nov 2025).
