
Physics-Informed ML via cKLE Methods

Updated 14 October 2025
  • Physics-informed machine learning integrates observed data with governing PDEs to produce consistent parameter estimates and field reconstructions.
  • The cKLE method conditions Gaussian processes on data, yielding a low-dimensional and uncertainty-aware representation for inverse problems.
  • By minimizing PDE residuals with regularization, the approach improves over MAP and PINN methods, offering scalable and physically robust solutions.

Physics-informed machine learning (PIML) methods are a class of computational frameworks that integrate domain knowledge in the form of physical laws—typically partial differential equations (PDEs)—with data-driven statistical or machine learning models. By enforcing consistency with known governing equations while leveraging observed or simulated data, these approaches enable more accurate parameter estimation, field reconstruction, and inversion of physical systems where direct measurement or complete information is not available. Notable among these is the conditional Karhunen–Loève expansion (cKLE)-based method, which provides a low-dimensional, physically consistent representation for inverse problems involving spatially heterogeneous or partially observed physical fields.

1. Conditional Karhunen–Loève Expansions in Physics-Informed ML

A central concept in PIML for spatially distributed systems is the representation of uncertain or partially known fields using conditional Karhunen–Loève expansions. In a standard KLE, a random field is expanded as a sum of orthogonal eigenfunctions (obtained from the eigendecomposition of a covariance operator) modulated by independent random variables. The cKLE approach enhances this by conditioning the expansion on available measurement data, using Gaussian process regression (GPR) to determine the conditional mean and covariance. The resulting expansion is guaranteed to match the observed values at the measurement sites, and the conditional covariance reflects any uncertainty due to sparse observation.

Let $z$ denote a Gaussian random field on domain $D$ with mean $\bar z(x)$, covariance $C(x,x')$, and measurements $z_s$ at locations $X_s$. The conditional mean and covariance are given by

$$\bar z^c(x) = \bar z(x) + C(x, X_s)\, C_s^{-1} \left( z_s - \bar z(X_s) \right),$$

$$C^c(x,x') = C(x,x') - C(x, X_s)\, C_s^{-1}\, C(X_s, x'),$$

where $C_s$ is the (possibly noise-augmented) covariance matrix among measurements. The cKLE for $z$ then reads

$$z^c(x, \xi) = \bar z^c(x) + \sum_{i=1}^{M} \phi_i(x) \sqrt{\lambda_i}\, \xi_i,$$

where $(\phi_i, \lambda_i)$ are eigenpairs of the conditional covariance operator and $\xi_i$ are independent standard Gaussian random variables.
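As a concrete illustration, the following Python sketch builds a cKLE on a 1-D grid. The squared-exponential kernel, grid, measurement values, and the 99% variance truncation are illustrative assumptions, and the discrete eigendecomposition of the covariance matrix stands in for the continuous eigenproblem.

```python
import numpy as np

def sq_exp_cov(x, xp, var=1.0, ell=0.2):
    """Squared-exponential covariance C(x, x') on a 1-D domain."""
    return var * np.exp(-0.5 * (x[:, None] - xp[None, :]) ** 2 / ell ** 2)

x = np.linspace(0.0, 1.0, 200)            # prediction grid on D = [0, 1]
x_s = np.array([0.1, 0.5, 0.9])           # measurement locations X_s
z_s = np.array([0.3, -0.2, 0.5])          # measured values z_s
z_bar = np.zeros_like(x)                  # prior mean, here \bar z(x) = 0

C_xX = sq_exp_cov(x, x_s)                                # C(x, X_s)
C_s = sq_exp_cov(x_s, x_s) + 1e-8 * np.eye(x_s.size)     # noise-augmented C_s

# GPR update: conditional mean and covariance (zero prior mean at X_s)
z_bar_c = z_bar + C_xX @ np.linalg.solve(C_s, z_s)
C_c = sq_exp_cov(x, x) - C_xX @ np.linalg.solve(C_s, C_xX.T)

# Discrete eigendecomposition of C^c (quadrature weights absorbed on the
# uniform grid), truncated to retain 99% of the conditional variance
lam, phi = np.linalg.eigh(C_c)
lam, phi = np.clip(lam[::-1], 0.0, None), phi[:, ::-1]   # descending order
M = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.99)) + 1

xi = np.random.default_rng(0).standard_normal(M)   # i.i.d. N(0, 1) coefficients
z_c = z_bar_c + phi[:, :M] @ (np.sqrt(lam[:M]) * xi)
# By construction, z_c honors z_s at X_s (up to the 1e-8 noise term).
```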

2. Physics-Constrained Optimization via PDE Residual Minimization

To ensure that the reconstructed parameter and state fields are consistent with the underlying physics, the method minimizes the residual of the discretized governing equations, such as PDEs, evaluated at selected points in the domain. Given cKLE representations for both the parameter field $y$ and the state $u$,

$$y^c(x, \xi) = \bar y^c(x) + \psi_y(x)^T \xi, \qquad u^c(x, \eta) = \bar u^c(x) + \psi_u(x)^T \eta,$$

the coefficients $\xi, \eta$ are obtained by solving

$$\min_{\xi, \eta} \; \left\| r\left[ u^c(\cdot, \eta),\, y^c(\cdot, \xi) \right] \right\|_2^2 + \gamma \left( \|\xi\|_2^2 + \|\eta\|_2^2 \right),$$

where $r$ is the vector of PDE residuals evaluated at chosen domain points and $\gamma$ controls the strength of the regularization. This formulation enforces adherence to both the data and the physical law: the cKLE provides a data-compatible, low-dimensional basis, and the minimization imposes the physical constraints.
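The sketch below illustrates this optimization for a 1-D steady diffusion equation $\mathrm{d}/\mathrm{d}x \left( e^{y}\, \mathrm{d}u/\mathrm{d}x \right) = 0$. It assumes the conditional means and scaled bases have already been computed; random placeholder bases stand in for the actual cKLE eigenfunctions, and the grid size, truncation levels, and value of $\gamma$ are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

n, M_y, M_u = 100, 8, 8
dx = 1.0 / (n - 1)
rng = np.random.default_rng(0)

# Placeholder conditional means and scaled bases standing in for the
# cKLE quantities (y_bar_c, psi_y) and (u_bar_c, psi_u) built earlier.
y_bar_c = np.zeros(n)
psi_y = rng.normal(size=(n, M_y)) / np.sqrt(n)
u_bar_c = np.linspace(1.0, 0.0, n)        # consistent with u(0)=1, u(1)=0
psi_u = rng.normal(size=(n, M_u)) / np.sqrt(n)

def pde_residual(coef):
    """Residual of d/dx( exp(y) du/dx ) = 0 at interior grid points."""
    xi, eta = coef[:M_y], coef[M_y:]
    y = y_bar_c + psi_y @ xi              # y^c(x, xi)
    u = u_bar_c + psi_u @ eta             # u^c(x, eta)
    k = np.exp(y)                         # diffusivity
    flux = 0.5 * (k[1:] + k[:-1]) * np.diff(u) / dx   # face fluxes
    return np.diff(flux) / dx

gamma = 1e-4                              # regularization weight

def objective(coef):
    r = pde_residual(coef)
    return r @ r + gamma * (coef @ coef)  # ||r||^2 + gamma(||xi||^2 + ||eta||^2)

res = minimize(objective, np.zeros(M_y + M_u), method="L-BFGS-B")
xi_opt, eta_opt = res.x[:M_y], res.x[M_y:]
```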

3. Covariance Model Calibration and cKLE Construction

The construction of cKLEs requires specification and calibration of covariance models for both the parameter and the state fields. For the parameter field (e.g., log-diffusivity $y$ in a diffusion equation), standard choices (Gaussian, Matérn, or exponential kernels) are adopted, and their hyperparameters (correlation length, variance) are fitted to observed data via marginal likelihood maximization or cross-validation. For the state field $u$, where physical constraints must hold, an ensemble approach is used: parameter realizations are drawn from the cKLE, the PDE is solved for each, and the sample mean and covariance of $u$ are computed from the ensemble. The conditional mean and covariance for $u$ are then estimated, enabling a cKLE parameterization for the state as well.
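A minimal sketch of this ensemble step, reusing the 1-D diffusion setting above. Here `solve_diffusion` is a hypothetical forward solver (Dirichlet data $u(0)=1$, $u(1)=0$), and the basis for $y$ is again a placeholder.

```python
import numpy as np

def solve_diffusion(k):
    """Finite-difference solve of d/dx( k du/dx ) = 0 with u(0)=1, u(1)=0."""
    n = k.size
    kf = 0.5 * (k[1:] + k[:-1])           # face-averaged diffusivity
    A = np.zeros((n, n))
    b = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0
    b[0] = 1.0                            # Dirichlet boundary data
    for i in range(1, n - 1):
        A[i, i - 1], A[i, i], A[i, i + 1] = kf[i - 1], -(kf[i - 1] + kf[i]), kf[i]
    return np.linalg.solve(A, b)

n, M, n_ens = 100, 8, 500
rng = np.random.default_rng(1)
y_bar_c = np.zeros(n)
psi_y = rng.normal(size=(n, M)) / np.sqrt(n)      # placeholder cKLE basis for y

U = np.empty((n_ens, n))
for j in range(n_ens):
    y = y_bar_c + psi_y @ rng.standard_normal(M)  # draw a realization of y
    U[j] = solve_diffusion(np.exp(y))             # forward PDE solve

u_mean = U.mean(axis=0)                  # ensemble mean of the state
u_cov = np.cov(U, rowvar=False)          # ensemble covariance of the state
# u_mean and u_cov are then conditioned on state measurements (the same
# GPR update as for z) and eigendecomposed to obtain the state basis psi_u.
```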

For discontinuous parameter fields, a latent continuous field $f$ can be modeled with a smooth covariance, and the final (e.g., piecewise-constant) parameter is obtained through a suitable nonlinear transform (logistic or expit), capturing sharp interfaces more effectively than a direct KLE expansion.
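A short sketch of this latent-field construction, with an analytic stand-in for the latent sample and illustrative levels and steepness:

```python
import numpy as np
from scipy.special import expit   # logistic function 1 / (1 + e^{-t})

x = np.linspace(0.0, 1.0, 200)
# Stand-in for a sample of the smooth latent field f (in practice a
# cKLE sample drawn with, e.g., a Gaussian covariance).
f = np.sin(2 * np.pi * x) + 0.3 * np.cos(5 * np.pi * x)

y_lo, y_hi, sharpness = -1.0, 2.0, 25.0   # illustrative levels and steepness
y = y_lo + (y_hi - y_lo) * expit(sharpness * f)
# As sharpness grows, y approaches a piecewise-constant field whose
# interfaces track the zero level set of f, avoiding the Gibbs
# oscillations a direct truncated KLE of y would produce.
```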

4. Comparison with MAP Estimation and PINNs

The physics-informed cKLE approach—often referred to as PICKLE (Physics-Informed Conditional KLE)—is compared against two dominant inverse problem approaches:

  • Maximum a posteriori (MAP) estimation: MAP directly discretizes the domain and solves for parameter or state values at each mesh point, resulting in highly parameterized problems whose dimensionality rises with mesh resolution. MAP solutions are vulnerable to overfitting and produce “peaky” reconstructions near observation points.
  • Physics-Informed Neural Networks (PINNs): PINNs represent state and/or parameter fields with neural networks whose loss includes data misfit and PDE residual penalties. While flexible, PINN parameterizations may not capture inherent spatial correlations and often produce less smooth reconstructions.

PICKLE demonstrates both lower relative $\ell_2$ errors (in certain continuous field cases, MAP's error is larger by as much as 320%) and physically realistic parameter/state reconstructions. For parameter fields with discontinuities, PICKLE's latent field approach captures sharp transitions more effectively than MAP.

5. Computational and Practical Considerations

A key computational advantage of the cKLE-based approach is that the parameterization dimension is decoupled from the mesh resolution. While the dimension of the MAP problem scales with $N_{mesh}$ (the number of mesh points), the number of retained KL terms in PICKLE is determined by the field's smoothness and the spatial covariance kernel, not by the mesh size. Reported cost scalings are

$$\text{PICKLE cost} \sim N_{FV}^{1.15}, \qquad \text{MAP cost} \sim N_{FV}^{3.28},$$

where $N_{FV}$ is the number of finite-volume cells (mesh elements).
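To put these exponents in perspective: a tenfold mesh refinement would increase the PICKLE cost by a factor of roughly $10^{1.15} \approx 14$, but the MAP cost by a factor of roughly $10^{3.28} \approx 1900$.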

Furthermore, the method is flexible with respect to boundary condition changes: training on one set of Dirichlet data (e.g., river stage) produces a model that can be postprocessed to accommodate new boundary values without retraining, provided the conditional covariance has been estimated for representative scenarios.

6. Limitations, Extensions, and Future Research

While PICKLE offers substantial benefits in computational scaling and data-physics integration, several limitations and directions for future research are noted:

  • The cKLE representation assumes a degree of field smoothness; highly nonsmooth fields may require a higher number of KL terms or more expressive (possibly non-Gaussian) latent representations.
  • The bottleneck in high dimensions lies in the estimation of state covariance, especially for complex or nonlinear systems, motivating research into approximate ensemble methods (e.g., multilevel Monte Carlo) or surrogate models.
  • The approach is inherently calibrated to the chosen covariance model; improved hyperparameter estimation (including transfer learning or hierarchical Bayesian approaches) may enhance generalizability.
  • The extension to transient, nonlinear, or multiscale PDEs and the mitigation of issues such as Gibbs phenomena (for discontinuous fields) remain open.

7. Impact and Significance in Data-Driven Physical Modeling

The physics-informed cKLE method provides an interpretably regularized, low-dimensional mechanism for solving inverse problems by reconciling measurement data with the governing laws of complex physical systems. It advances the state of the art by integrating exact data conditioning, spatial correlation structure, and physics constraints into a unified optimization framework. The approach is situated at the intersection of statistical inference, stochastic modeling, and PDE-constrained optimization, laying a robust foundation for future research in scalable, interpretable, and physically consistent machine learning for scientific and engineering applications (Tartakovsky et al., 2019).

References

  • Tartakovsky, A. M., Barajas-Solano, D. A., & He, Q. (2019). Physics-informed machine learning with conditional Karhunen–Loève expansions.