Multi-Fidelity Residual Neural Processes

Updated 8 February 2026
  • Multi-Fidelity Residual Neural Processes (MFRNP) are a deep learning framework for surrogate modeling that explicitly aggregates lower-fidelity predictions and applies a residual correction.
  • They use a two-stage decode–then–residual approach to achieve scalable, high-dimensional emulation and robust out-of-distribution generalization.
  • MFRNP outperforms traditional GP-based and neural surrogates in tasks such as PDEs, climate emulation, and robotics, offering real-time calibration and uncertainty quantification.

Multi-Fidelity Residual Neural Processes (MFRNP) are a deep learning framework for surrogate modeling in settings where outputs from simulators or experiments are available at multiple fidelities. The core innovation of MFRNP is the explicit modeling and aggregation of decoded predictions from lower-fidelity neural surrogates, followed by residual correction at the highest fidelity. This two-stage “decode–then–residual” approach enables scalable, accurate surrogate models capable of high-dimensional emulation and robust generalization to out-of-distribution (OOD) scenarios, significantly outperforming existing Gaussian Process (GP)-based and neural surrogates across PDEs, real-world climate emulation, and robotics (Niu et al., 2024; Hunter et al., 11 Nov 2025).

1. Model Architecture and Aggregation Strategy

MFRNP decomposes the multi-fidelity surrogate modeling task into two principal components. For $K$ fidelities, with $k = 1, 2, \ldots, K-1$ denoting lower fidelities and $K$ the highest fidelity:

  • Lower-Fidelity NPs: A separate Neural Process (NP) surrogate is learned for each lower fidelity. Each NP uses an encoder $q_{\phi_k}(z_k \mid D_k^c)$ operating on a split of context/target sets $(D_k^c, D_k^t)$, producing a latent $z_k \in \mathbb{R}^{d_z}$, and a decoder $p_{\theta_k}(y \mid z_k, x)$ mapping $(z_k, x)$ to output distributions.
  • Decoded Aggregation: For a given input $x$, outputs from all $K-1$ lower-fidelity decoders are averaged:

$$A(x) = \frac{1}{K-1} \sum_{m=1}^{K-1} \hat{f}_m(x)$$

where $\hat{f}_m(x) = \mathbb{E}_{q,p}[y_m \mid x]$.

  • Residual NP: Instead of directly modeling the highest fidelity, a residual Neural Process is trained with context $\{x, R(x)\}$ over the highest-fidelity inputs $X_K$, where $R(x) = f_K(x) - A(x)$. Its encoder $q_{\phi_K}(z_K \mid z_{1:K-1}, \theta_{1:K-1}, D'^c_K)$ explicitly receives all lower-fidelity latents and decoder parameters, and its decoder $p_{\theta_K}(R \mid z_K, x)$ predicts $R(x)$.

The final high-fidelity prediction is given by aggregation plus residual:

$$\hat{f}_K(x) = A(x) + \hat{R}(x)$$

where $\hat{R}(x) = \mathbb{E}_{q_{\phi_K}, p_{\theta_K}}[R \mid z_K, x]$ (Niu et al., 2024).
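The decode–then–residual prediction above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the lambdas stand in for trained lower-fidelity NP decoders and a trained residual NP.

```python
import numpy as np

def aggregate(x, lower_fidelity_decoders):
    """A(x): mean of the K-1 decoded lower-fidelity predictions."""
    preds = [f(x) for f in lower_fidelity_decoders]
    return np.mean(preds, axis=0)

def predict_high_fidelity(x, lower_fidelity_decoders, residual_decoder):
    """f_hat_K(x) = A(x) + R_hat(x): aggregation plus residual correction."""
    return aggregate(x, lower_fidelity_decoders) + residual_decoder(x)

# Toy stand-ins: two biased coarse surrogates of sin(x), and a residual
# model that has learned to cancel their average bias.
f1 = lambda x: np.sin(x) + 0.10
f2 = lambda x: np.sin(x) - 0.04
r  = lambda x: -0.03 * np.ones_like(x)

x = np.linspace(0.0, np.pi, 5)
y_hat = predict_high_fidelity(x, [f1, f2], r)  # recovers sin(x) exactly here
```

Here the aggregated prediction carries a +0.03 bias that the residual model removes, mirroring how the residual NP corrects the averaged lower-fidelity output toward the highest fidelity.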

2. Residual–Evidence Lower Bound and Joint Training

The training objective is a tailored Evidence Lower Bound (Residual-ELBO), reflecting both direct surrogate modeling at lower fidelities and residual learning at the highest fidelity:

  • Fidelity-Specific ELBO, $k = 1, \ldots, K-1$:

$$\mathcal{L}^{\hat{f}}_k = \mathbb{E}_{q_{\phi_k}(z_k \mid D_k^c \cup D_k^t)} \left[ \sum_{(x,y) \in D_k^t} \log p_{\theta_k}(y \mid z_k, x) \right] - \mathrm{KL}\left[ q_{\phi_k}(z_k \mid D_k^c \cup D_k^t) \,\|\, q_{\phi_k}(z_k \mid D_k^c) \right]$$

  • Residual ELBO, $k = K$:

$$\mathcal{L}^R = \mathbb{E}_{q_{\phi_K}(z_K \mid D'^c_K \cup D'^t_K)} \left[ \sum_{x \in D'^t_K} \log p_{\theta_K}(R(x) \mid z_K, x) \right] - \mathrm{KL}\left[ q_{\phi_K}(z_K \mid D'^c_K \cup D'^t_K) \,\|\, q_{\phi_K}(z_K \mid D'^c_K) \right]$$

  • Total Loss: Estimated with Monte Carlo samples:

$$\mathcal{L}_{MC} = \mathcal{L}^{\hat{f}}_{MC} + \mathcal{L}^{R}_{MC}$$

This objective necessitates that the highest-fidelity latent $z_K$ and its prediction depend on the decoded (not just latent) lower-fidelity information, enforcing both in-fidelity accuracy and cross-fidelity informational coupling (Niu et al., 2024).
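The per-fidelity ELBO terms above reduce, for diagonal Gaussian posteriors, to a Gaussian log-likelihood minus a closed-form KL divergence. The sketch below shows just that arithmetic; the encoder/decoder statistics are placeholder arrays that in MFRNP would be produced by the fidelity-$k$ NP.

```python
import numpy as np

def kl_diag_gauss(mu_q, var_q, mu_p, var_p):
    """KL[N(mu_q, var_q) || N(mu_p, var_p)] for diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def gauss_loglik(y, mu, var):
    """Sum of independent Gaussian log-densities over the target set."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - mu) ** 2 / var)

def elbo(y_targets, dec_mu, dec_var, mu_full, var_full, mu_ctx, var_ctx):
    """L_k = E_q[sum log p(y | z, x)] - KL[q(z | ctx ∪ tgt) || q(z | ctx)].
    One-sample Monte Carlo estimate; decoder stats assumed precomputed."""
    recon = gauss_loglik(y_targets, dec_mu, dec_var)
    return recon - kl_diag_gauss(mu_full, var_full, mu_ctx, var_ctx)
```

The residual ELBO $\mathcal{L}^R$ has the identical form with the residuals $R(x)$ in place of $y$, so the total Monte Carlo loss is simply the sum of `elbo` evaluations across fidelities.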

3. Inference Pipeline and Extrapolation Capacity

At inference, for an unseen input at the highest fidelity, MFRNP executes the following sequence:

  1. Lower-Fidelity Staging: For each $k = 1, \ldots, K-1$, latent samples are drawn from $q_{\phi_k}(z \mid D_k^c)$, followed by decoding and linear interpolation as needed, yielding predictions $\hat{o}_k$ on the finest input grid.
  2. Aggregation: $A(x^t_K) = \frac{1}{K-1} \sum_k \hat{o}_k$.
  3. Residual Correction: Draw samples $z_K' \sim q_{\phi_K}(z_K \mid D'^c_K)$ and decode to obtain $\hat{R}(x^t_K)$ from $p_{\theta_K}(\cdot \mid z_K', x^t_K)$.
  4. Output: $\hat{f}_K(x^t_K) = A(x^t_K) + \hat{R}(x^t_K)$.

The residual NP’s encoder, by directly conditioning on decoded lower-fidelity outputs, enables MFRNP to extrapolate beyond the domain of available high-fidelity training data, conferring a substantive OOD generalization advantage (Niu et al., 2024).
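The four inference steps can be laid out as a small pipeline. This is a structural sketch only: the `sample_latent`/`decode` stubs and the dict-based model interface are illustrative names, standing in for trained NP components, and grid interpolation is omitted.

```python
import numpy as np

def mfrnp_inference(x_target, lower_nps, residual_np, n_samples=8):
    # 1. Lower-fidelity staging: sample z_k ~ q(z | D_k^c), decode,
    #    and average over latent samples (interpolation omitted here).
    o_hats = []
    for np_k in lower_nps:
        zs = [np_k["sample_latent"]() for _ in range(n_samples)]
        preds = np.stack([np_k["decode"](z, x_target) for z in zs])
        o_hats.append(preds.mean(axis=0))
    # 2. Aggregation: average the K-1 decoded predictions.
    a = np.mean(o_hats, axis=0)
    # 3. Residual correction: sample z_K' and decode the residual.
    zs = [residual_np["sample_latent"]() for _ in range(n_samples)]
    r_hat = np.stack(
        [residual_np["decode"](z, x_target) for z in zs]
    ).mean(axis=0)
    # 4. Output: aggregation plus residual.
    return a + r_hat
```

With deterministic stubs returning constants 1.0 and 3.0 for the two lower fidelities and 0.5 for the residual, the pipeline returns 2.5 everywhere, i.e. mean(1, 3) + 0.5.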

4. Theoretical Motivation and Decoder-Based Sharing

Unlike prior multi-fidelity neural surrogates, which share only latent representations (commonly realized as a shared variable $r$ decoded by fidelity-specific decoders $\theta_k$), MFRNP's explicit sharing of decoder outputs resolves key inconsistencies:

  • Latent Sharing Limitation: Shared latents decoded by different $\theta_k$ may yield mutually inconsistent predictions across fidelities $k \neq i$.
  • Decoder Aggregation: Incorporating $\theta_{1:K-1}$ into residual learning ensures that all lower-fidelity decoders influence highest-fidelity predictions.
  • No Feed-Forward Error Propagation: By aggregating concrete, decoded predictions, MFRNP avoids the chained error propagation of hierarchical latent-only schemes and mitigates calibration errors and instability.

Ablation analyses confirm that replacing MFRNP’s decode–aggregate module with a purely latent hierarchical aggregator significantly increases prediction error (order-of-magnitude deterioration in normalized RMSE), affirming the necessity of decoder-level information sharing (Niu et al., 2024).

5. Extensions: Physics-Informed and Uncertainty-Calibrated Variants

The Multi-Fidelity Residual Physics-Informed Neural Process (MFR-PINP) generalizes MFRNP by integrating domain-specific physics priors and uncertainty calibration:

  • Architecture: Two parallel branches: a low-fidelity NP that learns a surrogate of an analytic model $g_1$, and a residual NP that corrects from this surrogate toward high-fidelity ground truth or an enhanced physics prior $g_2$.
  • Physics-Informed Feature Integration: The physics priors $g_1$ and $g_2$ are introduced as input features; a frozen copy of the low-fidelity decoder stabilizes residual learning.
  • Uncertainty Quantification: Split conformal prediction is employed at inference, yielding prediction intervals with finite-sample coverage guarantees for each state dimension.
  • Output Fusion: The high-fidelity prediction is a Gaussian with mean $\mu_T^{\mathrm{high}} = \mu_T^{\mathrm{low}} + \mu_T^{\mathcal{R}}$ and variance $\sigma_{\mathrm{high}}^2 = \sigma_{\mathrm{low}}^2 + \sigma_{\mathcal{R}}^2$ (Hunter et al., 11 Nov 2025).

A plausible implication is that this class of multi-fidelity decoupling and residual assembly architectures allows direct incorporation of physical knowledge, robust calibration, and interpretable uncertainty bounds in real-time applications such as robotics and control.
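The output-fusion rule and the standard split conformal interval can be sketched as follows. This is a minimal illustration under the usual split-conformal assumptions (exchangeable calibration scores); the function names are ours, not the paper's API.

```python
import numpy as np

def fuse(mu_low, var_low, mu_res, var_res):
    """Gaussian output fusion: means and variances add across branches,
    mu_high = mu_low + mu_res, var_high = var_low + var_res."""
    return mu_low + mu_res, var_low + var_res

def split_conformal_radius(scores, alpha=0.1):
    """Half-width of a (1 - alpha) prediction interval from calibration
    scores |y - mu|: the k-th smallest score with k = ceil((n+1)(1-alpha)),
    the standard split conformal quantile with finite-sample coverage."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return np.sort(scores)[min(k, n) - 1]

# Usage: fused mean +/- conformal radius gives the reported interval.
mu, var = fuse(1.0, 2.0, 3.0, 4.0)          # -> (4.0, 6.0)
radius = split_conformal_radius(np.arange(1, 101), alpha=0.1)
interval = (mu - radius, mu + radius)
```

With 100 calibration scores and alpha = 0.1, the rank is ceil(101 × 0.9) = 91, so the 91st smallest score sets the interval half-width, guaranteeing at least 90% coverage on exchangeable data.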

6. Empirical Performance and Benchmarks

MFRNP demonstrates superior empirical performance across multiple domains:

| Task & Domain | Metric | Best MFRNP Performance | Baseline Range | Relative Improvement |
|---|---|---|---|---|
| PDE surrogates (Heat/Poisson/Fluid) | nRMSE (full domain) | 0.004–0.007 | 0.10–0.38 | ∼90% error reduction |
| PDE surrogates (OOD) | nRMSE | 0.005–0.018 | 0.14–0.75 | Outperforms all baselines |
| Climate (CMIP6/ERA5) | Weighted nRMSE | Lower in 3/4 scenarios | All compared baselines | More stable over time |
| Robotics state estimation [MFR-PINP] | Per-step RMSE / NLL | RMSE 0.154, NLL –1.274 | DKF RMSE 0.958, NLL –0.192 | Best on both metrics |

Baselines in these studies include single-fidelity NPs, MF GPs (NARGP), hierarchical and disentangled MF NPs (MFHNP, D-MFD), Deep Multi-fidelity Active Learning (DMF), neural network GPs, and domain-specific frameworks such as DeepESD (Niu et al., 2024, Hunter et al., 11 Nov 2025).

MFRNP's advantage is most pronounced in complex, high-dimensional PDEs and under OOD configurations where high-fidelity data is sparse or localized. In climate emulation, MFRNP maintains consistent prediction accuracy over multi-decadal time ranges and at fine spatial resolutions, with baseline methods exhibiting degraded performance as projections extend (Niu et al., 2024).

In real-time robotic state estimation, MFR-PINP achieves lower test RMSE and better negative log-likelihood than transformer-based Deep Kalman Filters, with all neural approaches meeting strict real-time requirements on hardware-constrained platforms (Hunter et al., 11 Nov 2025).

7. Broader Implications, Limitations, and Future Directions

MFRNP offers a general-purpose, scalable surrogate modeling paradigm with strong performance in high-dimensional, multi-fidelity, and OOD regimes. Its architecture systematically leverages all available fidelity levels via decoded aggregation and residual modeling. MFRNP has already demonstrated advantages in physics-based emulation, sensor fusion, and real-time tracking and is extensible to broader domains including aerodynamics, structural health monitoring, and environmental forecasting, where rapid adaptation, physical consistency, and calibrated uncertainty are essential (Niu et al., 2024, Hunter et al., 11 Nov 2025).

Current limitations center on data efficiency—especially the acquisition cost for high-fidelity ground truth—as well as the need for occasional human oversight in model recalibration for safety-critical scenarios. Future directions include hybrid training with synthetic data to reduce sim-to-real gaps, adaptive conformal updates for tighter uncertainty quantification, and application to multi-agent and control-theoretic domains.

MFRNP's combination of decoded aggregation and residual correction establishes it as a foundational tool for modern multi-fidelity surrogate modeling, with unique capabilities to scale, generalize, and calibrate across diverse, high-stakes scientific and engineering tasks.
