Probabilistic ODE Solvers

Updated 18 June 2026

Probabilistic ODE solvers are numerical methods that return full posterior distributions instead of point estimates, encapsulating discretization and model uncertainty.
They employ Bayesian filtering and smoothing techniques (e.g., EKF/EKS, particle filters) to iteratively update hidden Gauss–Markov process states in the integration process.
Advances in calibration, adaptive step-sizing, and scalable high-dimensional implementations enhance parameter inference, optimal control, and practical uncertainty quantification.

Probabilistic ordinary differential equation (ODE) solvers constitute a numerical integration paradigm in which numerical uncertainty is quantified via probability measures over the ODE solution. Unlike classical (deterministic) ODE solvers that return discrete point estimates, probabilistic solvers produce posterior distributions—typically Gaussian or more general measures—over the solution trajectories, encoding both the mean and credible intervals that reflect epistemic uncertainty due to discretization and, in advanced settings, model or parameter uncertainty. Modern frameworks unify concepts from stochastic processes, state-space modeling, Bayesian filtering and smoothing, and Monte Carlo sampling, thereby enabling principled uncertainty quantification, adaptive discretization, scalable high-dimensional computation, and enhanced parameter inference.

1. Mathematical Formulation and Bayesian Filtering Framework

Core filtering-based probabilistic ODE solvers model the solution trajectory and its derivatives as a hidden Gauss–Markov process with dynamics specified by a linear time-invariant SDE (e.g., the q-times integrated Wiener process). The ODE is enforced via “information operators” or pseudo-measurements of residuals, typically encoded as:

Initial-value problem: $\dot{x}(t) = f(x(t), t),\; x(0) = x_0$
State augmentation: $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$
Prior SDE: $dX = FX\,dt + L\,dW$ (where $F$ is the companion matrix; $L$ injects noise into the highest derivative)
Discrete-time transitions: $X_{n+1} | X_n \sim N(A(h)X_n, Q(h))$ (with $A(h) = \exp(Fh)$ ; $Q(h)$ obtained via matrix differential Lyapunov equation)
Pseudo-observations: at each time step, enforce $z_n = E_1 X_n - f(E_0 X_n, t_n) = 0$ , where $E_0$ / $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 0 select the state and its derivative.

Posterior inference is performed using a (extended/iterated) Kalman filter/smoother, with the mean providing the point estimate and the covariance encoding the discretization uncertainty (Krämer et al., 2020, Bosch et al., 2020, Bosch, 6 Jun 2026). Nonlinearity in $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 1 induces filtering with either linearizations (EKF/EKS), sigma-point/unscented/quadrature filters, or—in more general contexts—non-Gaussian particle filters (Tronarp et al., 2018).

2. Model and Discretization Uncertainty Quantification

Probabilistic solvers return not just pointwise approximations but full distributions, providing a principled mechanism for uncertainty quantification (UQ). The posterior covariance at each grid point quantifies the conditional variance induced by discretization and model interrogation (Lahr et al., 2024, Bosch et al., 2020).

Mean: $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 2 is the posterior mean for $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 3.
Covariance: $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 4 contracts as $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 5 under a $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 6th-order model.
Calibration: Quasi-maximum-likelihood estimation of process noise parameters ( $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 7 or $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 8) is used to match residuals, with both global and time-varying calibrations available (Bosch et al., 2020, Krämer et al., 2020).

Uncertainty estimates are efficiently propagated through dynamical systems and higher-level computational tasks such as optimal control; for instance, in optimal control setups the uncertainty penalty $X(t) = [x(t), x^{(1)}(t), ..., x^{(q)}(t)]^\top$ 9 augments the cost function, enabling UQ-aware policy computation (Lahr et al., 2024).

3. Algorithms, Implementation, and High-Dimensional Scaling

Algorithmic advances underpin efficient deployment of probabilistic ODE solvers:

Prediction-Update (EKF/EKS): Forward-propagate predicted means/covariances; update via linearized ODE residuals (Krämer et al., 2020).
Smoothing (RTS/IEKS): Backward-pass yields marginal posteriors for all timepoints with improved covariance estimates (Krämer et al., 2020, Bosch et al., 2023).
Matrix-Free and Block-Diagonal Updates: For high-dimensional systems, implement mean and covariance updates via Jacobian-vector products and stochastic block-diagonal estimators to achieve $dX = FX\,dt + L\,dW$ 0 per-step cost (Bosch, 6 Jun 2026, Krämer et al., 2021).
Kronecker Structures: Leverage Kronecker structure in the prior for ODE/PDE discretizations with millions of variables (Krämer et al., 2021).
Parallel-in-Time Schemes: Use associative scan routines for $dX = FX\,dt + L\,dW$ 1 span, enabling massive parallelization on GPUs (Bosch et al., 2023).

Stability is ensured via innovations such as step-size-independent preconditioning, square-root (Cholesky/QR) implementations, and iterative re-linearization (matrix-free implicit schemes), with modern solvers remaining A-stable or L-stable even for stiff, high-dimensional systems (Krämer et al., 2020, Bosch, 6 Jun 2026).

4. Probabilistic Representations: Sampling and Butcher Tree Expansions

Beyond filter-based approaches, a distinct family leverages the Butcher (B-series) expansions for deterministic ODEs as expectations over random trees:

Butcher Trees and Random Generation: Express $dX = FX\,dt + L\,dW$ 2, where $dX = FX\,dt + L\,dW$ 3 is a randomly generated tree (size and structure sampled), $dX = FX\,dt + L\,dW$ 4 is the elementary differential, and $dX = FX\,dt + L\,dW$ 5 is a user-chosen size distribution (Huang et al., 2024).
Branching Process Representation: Marked branching processes with explicit integrability conditions and explosion-time bounds provide a probabilistic foundation for existence/uniqueness criteria and enable unbiased Monte Carlo estimators (Huang et al., 15 Feb 2025).
Convergence Guarantees: Under bounded derivatives and suitable size distributions (geometric), root-mean-square error decays as O( $dX = FX\,dt + L\,dW$ 6) with $dX = FX\,dt + L\,dW$ 7 samples (Huang et al., 2024).

This methodology is particularly attractive for high-accuracy regimes or when classical high-order truncations are computationally infeasible due to combinatorial explosion.

5. Calibration, Adaptivity, and Incorporation of Structural Information

Probabilistic solvers can include multiple sources of information to enhance accuracy and physical fidelity:

Calibration: Joint estimation of process-noise intensities, locally or globally, aligns ensemble dispersion with true error statistics. Time-/component-wise calibration improves contractivity and reduces both under- and over-confidence (Bosch et al., 2020).
Adaptive Step Size: The contracted posterior covariance provides a local error estimate for step-size controllers, permitting fully adaptive and calibrated high-order integration (Bosch et al., 2020).
Information Operators: Flexible inclusion of higher-order ODEs, physical constraints (e.g., energy conservation), algebraic equations, or DAEs is achieved by augmenting the pseudo-measurement step with general “information operators,” linearized and included in the filtering update (Bosch et al., 2021).
SDEs and Model/Parameter Uncertainty: Some frameworks propagate both numerical and model uncertainty, e.g., using marginalization over uncertain parameters or Brownian increments (for SDEs), yielding ensemble or closed-form Gaussian transition densities (Fay et al., 2023, Yao et al., 6 Mar 2025).

6. Practical Applications and Computational Advantages

Probabilistic ODE solvers find key applications in:

Parameter Inference: When used as forward solvers in statistical inference (MAP, Laplace, or Bayesian), probabilistic solvers yield likelihoods with built-in uncertainty quantification, improving robustness and calibration in ill-posed or stiff inverse problems (Wu et al., 2023, Wu et al., 26 Jun 2025).
Optimal Control: Posterior covariances can be propagated in cost functions for integration-error-aware control, producing UQ-informed optimal control laws and more reliable dynamical predictions under coarse discretization (Lahr et al., 2024).
Large-Scale Simulation: With matrix-free, block-structured implementation, probabilistic solvers scale to ODEs/PDEs in millions of dimensions while preserving uncertainty quantification and stability characteristics (Krämer et al., 2021, Bosch, 6 Jun 2026).

7. Future Directions and Open Challenges

Research continues into several challenges and future extensions:

Model Uncertainty Propagation: Efficient, quadrature-driven SME solvers for joint propagation of parameter/model and numerical uncertainty, especially critical for high-dimensional likelihood computations (Yao et al., 6 Mar 2025).
Sampling-Based Representations: Development of Monte Carlo solvers based on random tree structures for unbiased high-order integration, including rigorous explosion-time controls and quantifiable bias-variance tradeoff (Huang et al., 15 Feb 2025, Huang et al., 2024).
Parallel and PinT Extensions: Probabilistic parareal methods and other PinT strategies combine parallel scalability with nontrivial uncertainty quantification, including recursive variance propagation and early stopping criteria (Gattiglio et al., 4 Sep 2025).
Implicit and Stiff Integrators: Iterative, implicit matrix-free solvers deliver scalable A-stability for extremely stiff large-scale dynamics, outperforming earlier probabilistic methods on PDE tests (Bosch, 6 Jun 2026).
Open Software and Benchmarks: Modern toolkits such as “rodeo” standardize fast, differentiable, and scalable probabilistic solvers, providing support for a range of statistical inference workflows (Wu et al., 26 Jun 2025).

Open theoretical and algorithmic directions include fully rigorous convergence guarantees for data-adaptive likelihood approximations under non-Gaussian noise (Wu et al., 2023), integration of structured kernels or non-Gaussian priors in high-dimensional filtering, and further development of efficient implicit solvers for stiff and DAEs at scale.

Key References:

(Krämer et al., 2020) Stable Implementation of Probabilistic ODE Solvers
(Bosch et al., 2020) Calibrated Adaptive Probabilistic ODE Solvers
(Krämer et al., 2021) Probabilistic ODE Solutions in Millions of Dimensions
(Bosch et al., 2021) Pick-and-Mix Information Operators for Probabilistic ODE Solvers
(Lahr et al., 2024) Probabilistic ODE Solvers for Integration Error-Aware Numerical Optimal Control
(Huang et al., 15 Feb 2025) Probabilistic representation of ODE solutions with quantitative estimates
(Huang et al., 2024) On the random generation of Butcher trees
(Bosch, 6 Jun 2026) Stable and Scalable Probabilistic Numerical Solvers for Stiff and High-Dimensional ODEs
(Gattiglio et al., 4 Sep 2025) Prob-GParareal: A Probabilistic Numerical Parallel-in-Time Solver for Differential Equations
(Wu et al., 2023) Data-Adaptive Probabilistic Likelihood Approximation for Ordinary Differential Equations
(Wu et al., 26 Jun 2025) rodeo: Probabilistic Methods of Parameter Inference for Ordinary Differential Equations