Polynomial Chaos-Kriging Surrogates

Updated 22 December 2025

Polynomial Chaos-Kriging (PCK) is a hybrid surrogate modeling technique that combines global polynomial chaos expansions with local Gaussian process regression to provide high-fidelity predictions and uncertainty quantification.
It uses sparse polynomial basis selection via least angle regression (LAR) and robust kernel fitting to model high-dimensional, noisy, or non-smooth systems while limiting overfitting.
PCK surrogates enable efficient Bayesian inference, global optimization, and risk analysis, offering significant speedups and accurate uncertainty estimates in complex simulations.

Polynomial Chaos–Kriging (PCK) is a hybrid surrogate modelling methodology that combines the global approximation capability of polynomial chaos expansions (PCE) with the local fidelity and uncertainty quantification of universal Kriging (UK) or Gaussian process (GP) regression. PCK surrogates arise in contexts where computationally expensive models—for example, stochastic simulators in engineering, physics-based forward models in planetary science, or optimization routines in wind energy layout—must be emulated for tasks such as uncertainty quantification, Bayesian inference, and efficient global optimization. The PCK paradigm has undergone significant refinement since its introduction, with advancements in sparse polynomial basis selection via least angle regression (LAR), robust kernel fitting, and algorithmic scalability for high-dimensional, noisy, or non-smooth problems (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015, Wringer et al., 19 Dec 2025, Palar et al., 2018, Lee et al., 2022, García-Merino et al., 2022, Shao et al., 16 Feb 2025).

1. Mathematical Formulation

A PCK surrogate for a computational model $\mathcal{M}(x)$ is written as

$\widehat{Y}(x) = \underbrace{\sum_{\alpha\in\mathcal{A}} a_\alpha \Psi_\alpha(x)}_{\text{PCE trend}} + \underbrace{Z(x)}_{\text{Gaussian-process residual}}$

where:

$(\Psi_\alpha(x))$ is a multi-variate polynomial basis, orthonormal with respect to the input probability measure.
$\mathcal{A}$ is a sparse truncation set of multi-indices, often selected by total-degree or hyperbolic-norm rules and adaptively pruned.
$a_\alpha$ are trend coefficients estimated via regression.
$Z(x)\sim\mathcal{GP}(0, \sigma^2 R(x, x'; \theta))$ is a zero-mean GP with process variance $\sigma^2$ and kernel hyperparameters $\theta$ (e.g., Gaussian or Matérn).

This structure captures global behaviour through the PCE trend and local corrections through the GP residual. The covariance kernel $R(x,x';\theta)$ is typically chosen to match the smoothness or roughness of the model output, and its hyperparameters are fitted either by maximizing the log-likelihood or by minimizing the leave-one-out cross-validation error (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015, Wringer et al., 19 Dec 2025, Palar et al., 2018).
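As a minimal sketch of how the trend basis might be assembled, the following assumes independent inputs uniformly distributed on $[-1, 1]$, for which normalized Legendre polynomials are orthonormal. The function names (`hyperbolic_index_set`, `orthonormal_legendre`, `pce_design_matrix`) are hypothetical and not taken from the cited works, and the brute-force enumeration of multi-indices is only practical in low dimension.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre


def hyperbolic_index_set(dim, max_degree, q=1.0):
    """Multi-indices alpha with (sum_i alpha_i**q)**(1/q) <= max_degree.

    q = 1 gives the total-degree rule; q < 1 removes high-order interactions.
    """
    indices = []
    for alpha in itertools.product(range(max_degree + 1), repeat=dim):
        q_norm = sum(a ** q for a in alpha) ** (1.0 / q)
        if q_norm <= max_degree + 1e-12:  # tolerance for floating-point round-off
            indices.append(alpha)
    return indices


def orthonormal_legendre(x, degree):
    """Legendre polynomial of the given degree, normalized for U(-1, 1)."""
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return np.sqrt(2 * degree + 1) * legendre.legval(x, coeffs)


def pce_design_matrix(X, indices):
    """Evaluate Psi_alpha(x) = prod_i P_{alpha_i}(x_i) at each row of X."""
    X = np.atleast_2d(X)
    Psi = np.ones((X.shape[0], len(indices)))
    for j, alpha in enumerate(indices):
        for i, deg in enumerate(alpha):
            Psi[:, j] *= orthonormal_legendre(X[:, i], deg)
    return Psi
```

Each column of the returned matrix corresponds to one candidate $\Psi_\alpha$; other input distributions would need the matching orthonormal family (for example, Hermite polynomials for Gaussian inputs).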

2. Sparse Polynomial Basis Selection and Model Training

Crucial to PCK effectiveness is sparse selection of the polynomial trend terms, achieved via Least Angle Regression (LAR), LASSO, or similar techniques:

All candidate polynomials up to a user-specified maximal degree $p$, restricted by a $q$-norm (hyperbolic) truncation, are generated.
LAR iteratively adds the polynomial basis most correlated with the current residual to the active set, updating coefficients via regression and evaluating generalization via leave-one-out (LOO) error minimization.
The optimal subset $\mathcal{A}^*$ is selected at the point of minimal LOO-CV error, avoiding overfitting inherent to full PCE bases (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015).
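With the basis fixed, the trend coefficients $\beta$ (the $a_\alpha$ of Section 1) and the GP hyperparameters $(\theta, \sigma^2)$ are set by maximizing the log-likelihood, written here up to an additive constant: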

$\ell(\beta,\sigma^2,\theta) = -\frac{1}{2}\log|\Sigma_Z+\Sigma_\varepsilon| -\frac{1}{2}(\bar{\mathcal M}-\Psi\beta)^\top(\Sigma_Z+\Sigma_\varepsilon)^{-1}(\bar{\mathcal M}-\Psi\beta)$

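where $\bar{\mathcal M}$ is the vector of model responses at the design points, $\Psi$ is the matrix of selected basis polynomials evaluated at those points, and $\Sigma_Z$ and $\Sigma_\varepsilon$ respectively capture the process covariance and the noise covariance at the design points.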
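The selection loop can be illustrated with a simplified stand-in. The sketch below is not LAR itself: it uses a greedy forward selection (in the spirit of orthogonal matching pursuit) that adds the candidate most correlated with the current residual, refits by ordinary least squares, and keeps the subset with the smallest leave-one-out error. The LOO error is computed in closed form from the hat-matrix leverages and normalized by the sample variance of the responses; the function names are hypothetical.

```python
import numpy as np


def loo_error(Psi, y):
    """Relative leave-one-out error of an OLS fit, using hat-matrix leverages."""
    Q, _ = np.linalg.qr(Psi)
    leverage = np.sum(Q ** 2, axis=1)            # diagonal of the hat matrix
    coef, *_ = np.linalg.lstsq(Psi, y, rcond=None)
    residual = y - Psi @ coef
    loo_residual = residual / np.maximum(1.0 - leverage, 1e-12)
    return np.mean(loo_residual ** 2) / np.var(y)


def greedy_sparse_selection(Psi, y, max_terms=None):
    """Greedy basis selection with LOO-based stopping (a stand-in for LAR)."""
    n_samples, n_terms = Psi.shape
    if max_terms is None:
        max_terms = min(n_terms, n_samples - 1)
    norms = np.linalg.norm(Psi, axis=0)
    active, residual = [], y.copy()
    best_err, best_active = np.inf, []
    for _ in range(max_terms):
        score = np.abs(Psi.T @ residual) / norms  # correlation with residual
        if active:
            score[active] = -np.inf               # never reselect a term
        active.append(int(np.argmax(score)))
        coef, *_ = np.linalg.lstsq(Psi[:, active], y, rcond=None)
        residual = y - Psi[:, active] @ coef
        err = loo_error(Psi[:, active], y)
        if err < best_err:                        # remember the best subset so far
            best_err, best_active = err, list(active)
    return best_active, best_err
```

The returned column indices play the role of $\mathcal{A}^*$; a production implementation would typically use a true LAR path, which moves coefficients along equiangular directions rather than refitting greedily.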

3. Prediction and Uncertainty Quantification

For a new input $x^*$, the PCK prediction and its uncertainty are given as:

$\widehat{Y}(x^*) = \Psi(x^*)^\top\hat\beta + r_Z(x^*)^\top (\Sigma_Z+\Sigma_\varepsilon)^{-1} (\bar{\mathcal M} - \Psi\,\hat\beta)$

$\widehat{\mathrm{MSE}}(x^*) = \sigma^2 - r_Z(x^*)^\top (\Sigma_Z+\Sigma_\varepsilon)^{-1} r_Z(x^*)$

where $r_Z(x^*) = [\sigma^2 R(x^*, x^{(i)};\theta)]^\top$ is the vector of covariances between $Z(x^*)$ and the process at the design points $x^{(i)}$, and quantifies local correlation with the training data. The predictive uncertainty is thereby split into an extrinsic part (from $Z(x)$) and intrinsic noise (from measurement uncertainty, modeled by $\Sigma_\varepsilon$) (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015).
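The likelihood and predictor formulas can be evaluated together for fixed hyperparameters, as in the following sketch. It assumes a Gaussian correlation function, takes $\Sigma_Z = \sigma^2 R$ and $\Sigma_\varepsilon$ diagonal, treats $r_Z(x^*)$ as the cross-covariance $\sigma^2 R(x^*, x^{(i)})$, and estimates $\hat\beta$ by generalized least squares. The function names are hypothetical. The variance follows the simplified expression in the text, so it omits the extra term that full universal Kriging adds for uncertainty in $\hat\beta$.

```python
import numpy as np


def gaussian_corr(XA, XB, theta):
    """Anisotropic Gaussian correlation R(x, x'; theta) between two point sets."""
    diff = (XA[:, None, :] - XB[None, :, :]) / theta
    return np.exp(-0.5 * np.sum(diff ** 2, axis=2))


def pck_fit_predict(X, y, Psi, X_new, Psi_new, sigma2, theta, noise_var):
    """GLS trend, log-likelihood (up to a constant), predictive mean and MSE."""
    n = len(y)
    # Sigma_Z + Sigma_eps at the design points (noise_var: scalar or length-n array)
    Sigma = sigma2 * gaussian_corr(X, X, theta) + noise_var * np.eye(n)
    L = np.linalg.cholesky(Sigma)

    def solve(B):
        # Apply (Sigma_Z + Sigma_eps)^{-1} through the Cholesky factor
        return np.linalg.solve(L.T, np.linalg.solve(L, B))

    Si_Psi = solve(Psi)
    beta = np.linalg.solve(Psi.T @ Si_Psi, Si_Psi.T @ y)   # GLS estimate
    resid = y - Psi @ beta
    Si_resid = solve(resid)
    # -1/2 log|Sigma| equals -sum(log(diag(L)))
    loglik = -np.sum(np.log(np.diag(L))) - 0.5 * resid @ Si_resid

    r = sigma2 * gaussian_corr(X_new, X, theta)            # rows are r_Z(x*)^T
    mean = Psi_new @ beta + r @ Si_resid
    mse = sigma2 - np.sum(r * solve(r.T).T, axis=1)
    return beta, loglik, mean, mse
```

In practice `loglik` would be wrapped in an optimizer over `sigma2` and `theta`; here the hyperparameters are simply passed in.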

4. Integration into Bayesian Inference, Optimization, and Reliability

PCK surrogates are widely integrated in advanced workflows:

Bayesian inference: Within Markov chain Monte Carlo (MCMC), the expensive forward model is replaced by a PCK surrogate, yielding a large computational speedup (e.g., a factor of $\sim$320 in exoplanet interior characterization). Surrogate error is statistically propagated by inflating the likelihood variance (Wringer et al., 19 Dec 2025, García-Merino et al., 2022).
Efficient Global Optimization (EGO): The Expected Improvement (EI) acquisition function is computed using both the mean and variance from the PCK surrogate. Automatic trend selection via LAR/LOO tends to outperform both pure Kriging and blind polynomial selection when the underlying response exhibits moderate polynomial structure, but in highly rugged cases a constant trend may be preferable (Palar et al., 2018, Shao et al., 16 Feb 2025).
Reliability and risk: In high-dimensional settings, dimensionally decomposed PCEs merged with Kriging (DD-GPCE-Kriging) enable scalable estimation of quantities like Conditional Value-at-Risk (CVaR), achieving up to $10^4$-fold speedups via multifidelity importance sampling (Lee et al., 2022).
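Two of these uses reduce to short formulas once the surrogate mean and variance are available. The sketch below shows the standard closed-form Expected Improvement for a minimization problem, and one common way of inflating a Gaussian likelihood by adding the surrogate variance to the observational variance. The additive form of the inflation and both function names are assumptions for illustration, not a description of the cited implementations.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf, otypes=[float])


def expected_improvement(mean, mse, y_best):
    """Closed-form EI for minimization from surrogate mean and variance."""
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    std = np.sqrt(np.maximum(np.atleast_1d(np.asarray(mse, dtype=float)), 0.0))
    ei = np.maximum(y_best - mean, 0.0)        # limiting value where std == 0
    pos = std > 0
    z = (y_best - mean[pos]) / std[pos]
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    ei[pos] = (y_best - mean[pos]) * cdf + std[pos] * pdf
    return ei


def inflated_gaussian_loglik(data, data_var, surrogate_mean, surrogate_mse):
    """Gaussian log-likelihood with variance inflated by the surrogate variance.

    Assumption: surrogate and observation errors are independent, so their
    variances add.
    """
    total_var = np.asarray(data_var, dtype=float) + np.asarray(surrogate_mse, dtype=float)
    misfit = np.asarray(data, dtype=float) - np.asarray(surrogate_mean, dtype=float)
    return -0.5 * np.sum(np.log(2.0 * np.pi * total_var) + misfit ** 2 / total_var)
```

Inside an MCMC loop, `inflated_gaussian_loglik` would be called with the surrogate prediction at each proposed parameter vector in place of the forward model output.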

5. Domain Partitioning and High-Dimensional Extensions

For non-smooth, highly non-linear, or high-dimensional models:

Multielement PCK: The input space is partitioned into $J$ non-overlapping subdomains; independent PCK surrogates are fit locally and assembled piecewise. This approach maintains local adaptation capabilities—fitting sharp transitions and discontinuities—while balancing computational tractability via efficient domain allocation (García-Merino et al., 2022).
Dimensionally decomposed PCEs: Basis reduction is achieved by restricting polynomial interactions to at most $S$-variate terms, dramatically decreasing the number of basis functions and enabling applications with $N\geq20$ dimensions while maintaining accuracy and computational tractability (Lee et al., 2022).
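The size of the reduction can be checked by direct counting. Assuming a total-degree truncation of degree $p$ in both cases, the full basis has $\binom{N+p}{p}$ terms, while restricting to at most $S$-variate terms leaves $\sum_{s=0}^{S}\binom{N}{s}\binom{p}{s}$ terms. The sketch below evaluates both counts; the function names are hypothetical.

```python
from math import comb


def full_basis_size(n_dim, degree):
    """Number of multi-indices with total degree <= degree."""
    return comb(n_dim + degree, degree)


def dd_basis_size(n_dim, degree, max_interaction):
    """Number of multi-indices with total degree <= degree and
    at most max_interaction nonzero entries (S-variate truncation)."""
    return sum(comb(n_dim, s) * comb(degree, s) for s in range(max_interaction + 1))


if __name__ == "__main__":
    # Example: N = 20 inputs, degree p = 3, at most bivariate interactions
    print(full_basis_size(20, 3), dd_basis_size(20, 3, 2))
```

For $N = 20$, $p = 3$, and $S = 2$ this gives 631 terms instead of 1771, and the gap widens quickly as $N$ or $p$ grows.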

6. Empirical Performance and Validation Benchmarks

Empirical studies across a range of disciplines quantify the advantages of PCK:

Case Study	PCK vs. SK/Kriging	Surrogate Error	Computational Speedup
Stochastic queue (M/M/1), Egg-box, Ishigami	RMSE improvement: 20–74%; NMAE: 8–60%	RMSE $\sim$ 0.5–1%	1–2 orders of magnitude
Exoplanet inversion	$R^2 > 0.99$; coverage: 93–96%	Error $\ll$ data uncertainty	$\sim$320×
Wind farm layout optimization	$R^2 > 0.99$; out-of-sample RMSE < 0.5%	Sub-percent RMSE	10–500×

These findings consistently show that sparse polynomial trend selection (LAR, LASSO) is critical: full PCEs tend to overfit, while sparse, cross-validated selection ensures robust accuracy and generalization. PCK is especially advantageous when experimental design size is limited or intrinsic noise is significant (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015, Wringer et al., 19 Dec 2025, Palar et al., 2018, Shao et al., 16 Feb 2025).

7. Practical Guidelines and Implementation Notes

Best practices for deploying PCK surrogates include:

Always apply LAR or LASSO to candidate polynomials for trend selection; full PCE basis should generally be avoided (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015).
For high-dimensional applications, impose total-degree or dimensionally decomposed truncations and use space-filling experimental designs (e.g., Latin Hypercube Sampling).
Kernel selection should reflect the smoothness and structure of $Z(x)$: a Gaussian kernel is the usual default for smooth models, a Matérn kernel suits rougher behaviour, and a nugget is advisable when intrinsic noise is present (Wringer et al., 19 Dec 2025, Palar et al., 2018).
Surrogate accuracy must be validated on a large hold-out sample via root-mean-square error (RMSE), normalized mean absolute error (NMAE), and confidence interval coverage rates. For reliability studies, integration into multifidelity sampling frameworks is recommended (Lee et al., 2022, García-Merino et al., 2022).
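Hyperparameters in universal Kriging are typically set via maximum likelihood. Because the log-likelihood is often non-convex, this usually calls for a global search (e.g., a genetic algorithm), optionally followed by local gradient-based refinement such as BFGS, or for multi-start local optimization (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015).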
Surrogate uncertainty should be propagated in downstream tasks by explicit variance inflation, ensuring statistically robust inference and optimization convergence (Wringer et al., 19 Dec 2025, García-Merino et al., 2022).
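The hold-out validation metrics named above can be computed in a few lines. The sketch below is illustrative only: it assumes NMAE is the mean absolute error normalized by the range of the hold-out responses (other normalizations are in use) and that coverage is measured with a symmetric Gaussian interval of half-width $z\sqrt{\widehat{\mathrm{MSE}}}$. The function name is hypothetical.

```python
import numpy as np


def validation_metrics(y_true, y_pred, pred_var, z=1.96):
    """RMSE, range-normalized MAE, and interval coverage on a hold-out set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    rmse = np.sqrt(np.mean(abs_err ** 2))
    nmae = np.mean(abs_err) / (y_true.max() - y_true.min())
    half_width = z * np.sqrt(np.asarray(pred_var, dtype=float))
    coverage = np.mean(abs_err <= half_width)   # fraction inside the interval
    return rmse, nmae, coverage
```

With `z=1.96`, a well-calibrated surrogate should give coverage close to 95% on a sufficiently large hold-out sample.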

References to Core Developments

Sparse trend selection, universal/stochastic Kriging: (García-Marino et al., 5 Feb 2024, Schoebi et al., 2015)
Surrogate-accelerated Bayesian inversion, workflow guidance: (Wringer et al., 19 Dec 2025, García-Merino et al., 2022)
Efficient global optimization with expected improvement: (Palar et al., 2018, Shao et al., 16 Feb 2025)
High-dimensional and domain-decomposed PCK: (Lee et al., 2022, García-Merino et al., 2022)

Polynomial Chaos–Kriging therefore constitutes a robust, statistically principled, and efficient framework for surrogate modelling of expensive, stochastic, and high-dimensional simulators, delivering high-fidelity predictions and calibrated uncertainties with modest experimental designs and scalable computational cost.