Multi-Fidelity Deep Learning Framework
- Multi-fidelity deep learning frameworks are methods that integrate abundant low-fidelity and scarce high-fidelity data using weighted loss fusion, transfer learning, and residual aggregation.
- They employ diverse architectures—from feed-forward and physics-informed neural networks to deep Gaussian processes and graph networks—to capture complex, high-dimensional phenomena.
- These frameworks dramatically reduce training costs and enhance uncertainty quantification, enabling robust generalization and efficient scientific and engineering computations.
Multi-fidelity deep learning frameworks integrate data or simulators of varying accuracies and costs to build surrogates, control agents, or uncertainty-quantified models for high-dimensional scientific and engineering domains. They balance abundant, inexpensive low-fidelity data—typically approximate physics models or coarse numerical simulations—with scarce, expensive high-fidelity sources such as detailed experiments or fine-grid calculations. Architecturally, these frameworks span deep feed-forward neural networks, physics-informed neural networks, transfer or residual neural processes, deep and conditional Gaussian processes, generative flows, graph neural networks, and active learning loop designs. Mathematical approaches include loss-function fusion with weighted fidelity penalties, hierarchical information flow, latent-space aggregation, and compositional kernel structures. Multi-fidelity paradigms achieve dramatic reductions in training cost and data requirements, robust generalization to out-of-distribution regimes, and actionable uncertainty quantification, making them essential for domains where high-accuracy data are expensive or limited.
1. Network Architectures and Fusion Strategies
Modern multi-fidelity deep learning architectures are categorized by their mechanisms for fusing data across fidelities.
- Weighted Loss Fusion in Feed-forward DNNs: In aerodynamic surrogate modeling, a DNN receives physically relevant normalized inputs and is trained with a single composite loss of the form
$$\mathcal{L} = \mathcal{L}_{\mathrm{HF}} + \lambda\,\mathcal{L}_{\mathrm{LF}},$$
where $\lambda$ controls the balance of high-fidelity (experiment) and low-fidelity (CFD) error terms (Li et al., 2021). This approach scales to additional fidelity levels via further weighted loss terms; a minimal PyTorch sketch follows this list.
- Autoencoder-based Transfer Learning: Abundant low-fidelity data are encoded into a compact latent representation via a multi-layer autoencoder. The decoder and (if needed) an up-scaler are then fine-tuned exclusively on scarce high-fidelity data, with the encoder weights frozen. This induces a physics-aligned latent manifold that leverages low-fidelity coverage and corrects systematic LF-HF discrepancies with minimal fine-tuning (Nieto-Centenero et al., 15 Dec 2025); see the autoencoder sketch after this list.
- Residual Aggregation in Neural Processes: Multi-fidelity Residual Neural Processes explicitly decode lower-fidelity NP surrogates at high-fidelity points, average their outputs, and learn a residual correction at the highest fidelity, i.e.
$$\hat{f}_T(x) = \frac{1}{T-1}\sum_{t=1}^{T-1} \hat{f}_t(x) + r_T(x),$$
where $r_T$ is the learned residual. This makes the lower-fidelity decoder parameters directly informative for cross-fidelity predictions, yielding improved OOD generalization (Niu et al., 29 Feb 2024).
- Concatenated Neural Networks with Physics Injection: Low-fidelity predictions, such as analytic or empirical models, are concatenated into hidden layers of a DNN, enforcing physics-guided inductive bias and reducing uncertainty—enabling more physically consistent extrapolation with less training data (Pawar et al., 2021).
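A minimal PyTorch sketch of the weighted loss fusion above, assuming a generic regression surrogate; the architecture, weight value, and variable names are illustrative rather than those of the cited work:

```python
import torch
import torch.nn as nn

class SurrogateDNN(nn.Module):
    """Simple feed-forward surrogate; layer sizes are illustrative only."""
    def __init__(self, d_in, d_out, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, d_out),
        )

    def forward(self, x):
        return self.net(x)

def composite_loss(model, x_hf, y_hf, x_lf, y_lf, lam=0.1):
    """L = L_HF + lam * L_LF: scarce experiments anchor the fit,
    abundant CFD samples regularize it."""
    mse = nn.MSELoss()
    loss_hf = mse(model(x_hf), y_hf)
    loss_lf = mse(model(x_lf), y_lf)
    return loss_hf + lam * loss_lf
```

Adding further fidelity levels amounts to appending more weighted terms to the returned sum.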
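A companion sketch of the autoencoder-based transfer learning: pretrain on abundant LF fields, then freeze the encoder and fine-tune only the decoder on scarce HF data. Layer sizes, optimizers, and the assumption of paired LF inputs with HF targets are illustrative, not taken from the cited work.

```python
import torch
import torch.nn as nn

# Hypothetical field dimension (256) and latent size (16).
encoder = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 256))
autoencoder = nn.Sequential(encoder, decoder)

def pretrain_on_lf(lf_loader, epochs=100):
    """Stage 1: learn a compact latent manifold from abundant LF fields."""
    opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x_lf in lf_loader:
            opt.zero_grad()
            loss = loss_fn(autoencoder(x_lf), x_lf)
            loss.backward()
            opt.step()

def finetune_on_hf(hf_loader, epochs=50):
    """Stage 2: freeze the encoder; fine-tune only the decoder on scarce HF data."""
    for p in encoder.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x_lf, x_hf in hf_loader:  # assumed paired LF input, HF target
            opt.zero_grad()
            loss = loss_fn(decoder(encoder(x_lf)), x_hf)
            loss.backward()
            opt.step()
```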
2. Deep Probabilistic and Kernel-based Multi-Fidelity Models
Bayesian approaches structure the exploitation of multi-fidelity data through hierarchical or compositional kernels.
- Deep Gaussian Processes (DGP) for Multi-Fidelity: Each layer in a DGP maps one fidelity, propagating both nonlinear transformation and calibrated uncertainty. The composite kernel at fidelity $l$ combines the original inputs with the previous layer's output, e.g.
$$k_l\big((x, f_{l-1}(x)), (x', f_{l-1}(x'))\big) = k_{\rho}(x,x')\,k_{f}\big(f_{l-1}(x), f_{l-1}(x')\big) + k_{\delta}(x,x').$$
Variational inference with inducing points enables scaling and full uncertainty propagation. This approach outperforms AR(1) co-kriging and nonlinear GP baselines on many benchmarks (Cutajar et al., 2019).
- Conditional DGPs and Moment-Matching Kernels: Conditioning intermediate GPs on fixed low-fidelity data, the marginal prior at the top fidelity collapses (by moment matching) into a single GP with an effective kernel
$$k_{\mathrm{eff}}(x, x') = \mathbb{E}_{f \sim p(f \mid \mathcal{D}_{\mathrm{LF}})}\big[k\big(f(x), f(x')\big)\big].$$
Explicit formulas for squared-exponential and cosine compositions embed uncertainty and adaptivity to the input domain, yielding superior generalization and uncertainty quantification (Lu et al., 2020). A Monte Carlo approximation of this kernel is sketched after this list.
- Deep Multi-Fidelity Gaussian Processes with Latent Warping: A feed-forward neural network $h$ maps input variables into a latent space, upon which an AR(1)-style co-kriging prior is placed, flexibly modeling discontinuous cross-fidelity mappings:
$$f_H(x) = \rho\, f_L(x) + \delta(x), \qquad f_L \sim \mathcal{GP}\big(0,\, k_L(h(x), h(x'))\big), \quad \delta \sim \mathcal{GP}\big(0,\, k_\delta(h(x), h(x'))\big),$$
with the covariances composed of nonlinear kernels acting on the warped inputs. Backpropagation is used for hyperparameter optimization (Raissi et al., 2016). The resulting two-fidelity block covariance is sketched after this list.
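A Monte Carlo sketch of the moment-matched effective kernel above: sample the LF-conditioned latent values at a pair of inputs and average the top-layer kernel over the samples. The Gaussian posterior moments and length scale below are illustrative inputs, not values from the cited paper.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel on scalar (arrays of) latent values."""
    return np.exp(-0.5 * (a - b) ** 2 / ls ** 2)

def effective_kernel_mc(mu, cov, n_samples=2000, ls=1.0, seed=0):
    """Monte Carlo estimate of k_eff(x, x') = E[k(f(x), f(x'))], where
    (f(x), f(x')) ~ N(mu, cov) is the LF-conditioned GP posterior at the pair.
    mu: shape (2,), cov: shape (2, 2)."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mu, cov, size=n_samples)
    return rbf(samples[:, 0], samples[:, 1], ls).mean()

# Example: moderately correlated LF posterior at two nearby inputs.
mu = np.array([0.3, 0.5])
cov = np.array([[0.04, 0.02],
                [0.02, 0.04]])
print(effective_kernel_mc(mu, cov))
```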
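And a sketch of the latent-warping co-kriging structure: a small feed-forward warp followed by the standard two-fidelity AR(1) block covariance. Kernel choices, the warp architecture, and the fixed hyperparameters (rho, length scales) are placeholders; in the cited approach they are learned jointly by backpropagation.

```python
import numpy as np

def warp(X, W1, b1, W2, b2):
    """Tiny feed-forward warp h(x); weights would be learned by backprop."""
    return np.tanh(X @ W1 + b1) @ W2 + b2

def rbf_kernel(A, B, ls=1.0, var=1.0):
    """Squared-exponential kernel between rows of A (n, d) and B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ls ** 2)

def cokriging_cov(Z_L, Z_H, rho=0.8):
    """Joint AR(1) co-kriging covariance over warped LF inputs Z_L and HF inputs Z_H,
    following f_H = rho * f_L + delta. Returns the (n_L + n_H) square block matrix."""
    k_LL = rbf_kernel(Z_L, Z_L)
    k_LH = rho * rbf_kernel(Z_L, Z_H)
    k_HH = rho ** 2 * rbf_kernel(Z_H, Z_H) + rbf_kernel(Z_H, Z_H, ls=0.5, var=0.1)
    return np.block([[k_LL, k_LH],
                     [k_LH.T, k_HH]])

# Usage: Z_L, Z_H = warp(X_L, *params), warp(X_H, *params); K = cokriging_cov(Z_L, Z_H)
```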
3. Active Learning, Acquisition, and Sample Efficiency
Multi-fidelity frameworks frequently interleave model fitting with intelligent data acquisition to maximize information gain per unit cost.
- Mutual-Information Acquisition: Deep Multi-Fidelity Active Learning (DMFAL) and DNN-MFBO use mutual information-based acquisition functions that balance information gain against query cost, e.g.
$$a(x, m) = \frac{1}{\lambda_m}\,\mathbb{I}\big(y_m(x);\, y_M \mid \mathcal{D}\big),$$
where $\lambda_m$ is the cost of querying fidelity $m$ and $M$ denotes the highest fidelity. Exploiting the multivariate delta method, moment matching, and the Weinstein-Aronszajn identity, posterior and entropy computations scale efficiently to high output dimensions (Li et al., 2020, Li et al., 2020). A Gaussian special case is sketched after this list.
- Disentangled Latent Information Gain: D-MFDAL builds flexible neural-process surrogates at each fidelity, using closed-form Bayesian aggregation and latent information gain as its acquisition metric, of the form
$$a(x, m) = \frac{1}{\lambda_m}\,\mathbb{E}_{y}\Big[D_{\mathrm{KL}}\big(p(z \mid \mathcal{D} \cup \{(x, y, m)\}) \,\|\, p(z \mid \mathcal{D})\big)\Big].$$
Disentangling the per-fidelity latent states avoids error propagation and enhances OOD sample efficiency (Wu et al., 2023).
- Conformal Uncertainty Band Construction: In multi-fidelity autoencoder frameworks, uncertainty bands are derived using multi-split conformal prediction, ensuring pointwise empirical coverage even under severe data scarcity. Median aggregation across randomized calibration splits yields robust, spatially adaptive intervals (Nieto-Centenero et al., 15 Dec 2025); a calibration sketch follows this list.
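A minimal sketch of the cost-normalized acquisition above under a bivariate-Gaussian predictive approximation; the closed-form mutual information used here is the standard Gaussian identity, and the candidate dictionary is an illustrative stand-in for the surrogate posterior rather than the cited implementations.

```python
import numpy as np

def gaussian_mutual_information(var_m, var_M, cov_mM):
    """MI between two jointly Gaussian scalars:
    I = 0.5 * log( var_m * var_M / det(Sigma) )."""
    det = var_m * var_M - cov_mM ** 2
    return 0.5 * np.log(var_m * var_M / det)

def select_query(candidates, costs):
    """Pick the (input, fidelity) pair maximizing information gain per unit cost.
    `candidates` maps (x, m) -> (var_m, var_M, cov_mM) from the surrogate posterior."""
    def score(item):
        (x, m), (var_m, var_M, cov_mM) = item
        return gaussian_mutual_information(var_m, var_M, cov_mM) / costs[m]
    (x_best, m_best), _ = max(candidates.items(), key=score)
    return x_best, m_best

# Toy usage: two candidate queries at fidelities 0 (cheap) and 1 (expensive).
candidates = {((0.2,), 0): (0.30, 0.50, 0.25),
              ((0.2,), 1): (0.50, 0.50, 0.45)}
costs = {0: 1.0, 1: 10.0}
print(select_query(candidates, costs))  # the cheap query wins per unit cost here
```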
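A sketch of multi-split conformal calibration as described above: each random calibration split yields one conformal quantile of the HF residuals, and the median across splits gives the band half-width. This toy version returns a single global half-width, whereas the cited framework constructs pointwise, spatially adaptive intervals.

```python
import numpy as np

def multisplit_conformal_halfwidth(residuals, alpha=0.1, n_splits=20, frac=0.5, seed=0):
    """Absolute HF residuals on held-out points -> half-width of a (1 - alpha) band.
    Each random split contributes one split-conformal quantile; the median over
    splits gives the aggregated half-width."""
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals)
    n = len(residuals)
    halfwidths = []
    for _ in range(n_splits):
        calib = rng.choice(n, size=int(frac * n), replace=False)
        r = np.sort(np.abs(residuals[calib]))
        k = int(np.ceil((1 - alpha) * (len(r) + 1))) - 1   # conformal rank
        halfwidths.append(r[min(k, len(r) - 1)])
    return np.median(halfwidths)

# Usage: band = y_pred +/- multisplit_conformal_halfwidth(y_hf - y_pred)
```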
4. Reinforcement Learning and Control with Multi-Fidelity Surrogates
Deep RL frameworks incorporate multi-fidelity via sequential transfer, hybrid surrogate environments, or spectrum-aware reward functions.
- Sequential Controlled Transfer Learning (CTL): PPO-trained policy networks are first trained in cheap low-fidelity simulators and transferred once the variance ratio of episodic rewards drops below a threshold. The HF phase then refines the inherited policy, cutting expensive HF episode costs by more than 30% with no loss of optimization performance (Bhola et al., 2022). The transfer trigger is sketched after this list.
- Hybrid Differentiable Models for Control: Complex dynamical or chaotic systems are controlled by RL agents interacting with a learned correction to a low-fidelity solver. The surrogate update takes the form
$$u^{n+1} = \mathcal{S}_{\mathrm{LF}}(u^{n}) + \mathcal{N}_{\theta}(u^{n}),$$
where $\mathcal{S}_{\mathrm{LF}}$ is one step of the coarse solver and $\mathcal{N}_{\theta}$ is the correction network. The correction NN is supervised on HF data and embedded in the RL loop, with spectrum-based rewards that match HF statistical characteristics (Sun et al., 8 Apr 2025). A minimal hybrid step is sketched after this list.
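A sketch of a variance-ratio transfer trigger for the sequential CTL scheme. The exact statistic, window, and threshold of the cited work are not given here, so this is only an assumed plateau test on episodic rewards.

```python
import numpy as np

def should_transfer(episode_rewards, window=50, ratio_threshold=0.1):
    """Trigger LF -> HF policy transfer when the recent-to-early variance ratio of
    episodic rewards falls below a threshold, i.e. LF training has plateaued.
    Window size and threshold are illustrative assumptions."""
    if len(episode_rewards) < 2 * window:
        return False
    early = np.var(episode_rewards[:window])
    recent = np.var(episode_rewards[-window:])
    return recent / (early + 1e-12) < ratio_threshold
```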
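A minimal sketch of the hybrid surrogate step, assuming a differentiable low-fidelity update `lf_step` and a small correction network (both names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class HybridStep(nn.Module):
    """u_{n+1} = S_LF(u_n) + N_theta(u_n): coarse solver step plus learned correction.
    `lf_step` is any differentiable low-fidelity update; the correction net is
    trained to match high-fidelity trajectories."""
    def __init__(self, lf_step, dim, width=64):
        super().__init__()
        self.lf_step = lf_step
        self.correction = nn.Sequential(
            nn.Linear(dim, width), nn.GELU(), nn.Linear(width, dim)
        )

    def forward(self, u):
        return self.lf_step(u) + self.correction(u)

# Supervision: minimize ||HybridStep(u_hf_n) - u_hf_{n+1}||^2 over HF snapshots,
# then let the RL agent act on the corrected surrogate environment.
```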
5. Structured Scientific Surrogate Applications
Domain-specific frameworks employ multi-fidelity structures to solve demanding industrial and scientific problems.
- Aerodynamic Regression (DNN, Autoencoder, and Conformal): DNN and autoencoder-based models fuse experimental and CFD data to reconstruct surface pressure distributions, exploiting fidelity-weighting strategies and transfer learning, leading to significant improvements in interpolation, extrapolation, and uncertainty calibration compared to single-fidelity surrogates (Li et al., 2021, Nieto-Centenero et al., 15 Dec 2025).
- Physics-Informed and Transfer-Learning PINNs: MF-PIDNN loads approximate physics into early NN layers via physics-informed loss, then fine-tunes final layers on scarce HF data, maintaining physically correct trends and enabling extrapolation with few HF points (Chakraborty, 2020).
- Wind Farm Wake Modeling (Transfer Learning): WakeNet pre-trains on analytic Gaussian-model wakes, then fine-tunes prediction heads on moderate Curl/CFD datasets, achieving pixel-wise accuracy and matching state-of-the-art farm-level optimization at two orders-of-magnitude speed-up (Anagnostopoulos et al., 2023).
- Interatomic Potentials with GNNs: Equivariant GNNs fuse low- and high-fidelity DFT atomic energies via fidelity-specific encoding and weighted losses. They attain high-fidelity PES predictions with minimal HF data, outperforming both transfer- and Δ-learning on benchmarks like Li₆PS₅Cl and InₓGa₁₋ₓN alloy mixing (Kim et al., 12 Sep 2024); a fidelity-weighted loss sketch follows this list.
- Graph U-Net for FEM and CFD: Multi-Fidelity Graph U-Net couples node and edge features across differently resolved graphs, with bidirectional k-NN projection for hierarchical information sharing. It surpasses transfer-learning and single-fidelity GNNs in error reduction at constant parameter budgets (Gladstone et al., 19 Dec 2024).
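A minimal sketch of fidelity-specific encoding with per-fidelity loss weights, in the spirit of the interatomic-potential bullet above; a plain feed-forward regressor stands in for the equivariant GNN, and all names and weight values are illustrative.

```python
import torch
import torch.nn as nn

NUM_FIDELITIES = 2  # e.g. 0 = cheaper DFT functional, 1 = high-fidelity reference

class FidelityAwareRegressor(nn.Module):
    """Concatenates a learned fidelity embedding to the input features so that
    one network can emit fidelity-consistent predictions."""
    def __init__(self, d_in, d_embed=8, width=128):
        super().__init__()
        self.fidelity_embed = nn.Embedding(NUM_FIDELITIES, d_embed)
        self.net = nn.Sequential(
            nn.Linear(d_in + d_embed, width), nn.SiLU(), nn.Linear(width, 1)
        )

    def forward(self, x, fidelity):
        z = torch.cat([x, self.fidelity_embed(fidelity)], dim=-1)
        return self.net(z).squeeze(-1)

def weighted_multifidelity_loss(model, batch, weights=(0.2, 1.0)):
    """Sum of per-fidelity MSE terms, each scaled by its fidelity weight.
    `batch` maps fidelity_id -> (inputs, targets)."""
    loss = 0.0
    for fid, (x, y) in batch.items():
        fid_ids = torch.full((x.shape[0],), fid, dtype=torch.long)
        loss = loss + weights[fid] * nn.functional.mse_loss(model(x, fid_ids), y)
    return loss
```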
6. Limitations, Generalization, and Extensions
Current multi-fidelity deep learning frameworks reveal both methodological boundaries and opportunities for continued advance.
- Interpolation and Extrapolation: Most architectures are fundamentally interpolative in input space; large extrapolations remain challenging. Fine-tuning and residual correction strategies mitigate but do not eliminate these limitations (Li et al., 2021, Nieto-Centenero et al., 15 Dec 2025).
- Hyperparameter Selection: Weighting coefficients (e.g. the loss weight λ in DNN fusion or per-fidelity weights in GNN training) generally require small HF validation sets for selection; a selection sketch appears at the end of this list.
- Bias and Unimodality: Unmodeled bias in LF sources can mislead fusion strategies unless weights are very small or corrections are explicitly learned.
- Model Extensions: Mechanisms to include physics-informed loss, automatic Bayesian selection of fidelity weights, extension to multi-physics (aeroelasticity, materials), and scalable sparse GP approximations are active research directions (Li et al., 2021, Lu et al., 2020, Kim et al., 12 Sep 2024).
- Active Learning Integration: Combining multi-fidelity active learning criteria (mutual information, acquisition costs, latent gain) with surrogate construction enables data-efficient exploration of complex simulation or design spaces (Li et al., 2020, Wu et al., 2023).
- Domain Expansion: Application to climate modeling, turbulence generation, multi-scale PDE surrogates, and real-time industrial optimization is ongoing, often using hybridized architectures and conformal calibration for uncertainty quantification (Niu et al., 29 Feb 2024, Nieto-Centenero et al., 15 Dec 2025).
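A sketch of selecting the fusion weight on a small HF validation set, assuming a user-supplied `train_fn` that trains a surrogate with the composite loss from Section 1 for a given weight; the grid values and training budget are illustrative.

```python
import torch
import torch.nn as nn

def select_fusion_weight(train_fn, x_hf_val, y_hf_val, grid=(0.01, 0.05, 0.1, 0.5, 1.0)):
    """Train one surrogate per candidate weight and keep the one with the
    lowest RMSE on the held-out HF validation set."""
    best_lam, best_rmse, best_model = None, float("inf"), None
    for lam in grid:
        model = train_fn(lam)  # trains with L_HF + lam * L_LF
        with torch.no_grad():
            rmse = torch.sqrt(nn.functional.mse_loss(model(x_hf_val), y_hf_val)).item()
        if rmse < best_rmse:
            best_lam, best_rmse, best_model = lam, rmse, model
    return best_lam, best_model
```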
7. Comparative Outcomes and Application Areas
Quantitative studies across aerodynamic regression, PDE surrogate inference, particle physics, wind farm optimization, structural mechanics, and interatomic potential estimation consistently document strong improvements for multi-fidelity frameworks over single-fidelity and transfer baselines. These improvements manifest in reduced root-mean-square errors, lower relative error ratios, calibrated uncertainty bands, and significant reductions in expensive simulation or experimental query rates.
Representative outcomes include:
| Domain | Multi-fidelity Model | Typical Benchmark Error | Reference |
|---|---|---|---|
| Aerodynamic Cp field | Weighted DNN Fusion | MF-DNN R≈15%, HF-only R≈30% | (Li et al., 2021) |
| Airfoil regression | AE Transfer + MSCP | RMSE=0.062, R²=0.998 | (Nieto-Centenero et al., 15 Dec 2025) |
| Interatomic potentials | GNN w/ fidelity weights | MAE(E)=5.5 meV/f.u., R²=0.98 | (Kim et al., 12 Sep 2024) |
| FEM/CFD surrogate | Graph U-Net | 2D stress error <0.3% | (Gladstone et al., 19 Dec 2024) |
| RL shape optimization | Sequential CTL | >30% episode cost savings | (Bhola et al., 2022) |
| PDE surrogate modeling | Residual NP | RMSE improvement ×10 | (Niu et al., 29 Feb 2024) |
| Active learning | DMFAL, DNN-MFBO | 10× query cost reduction | (Li et al., 2020, Li et al., 2020) |
The widespread success of multi-fidelity deep frameworks in scientific and engineering domains arises from their principled exploitation of data/physical approximation hierarchies, integration of uncertainty quantification, and their adaptive, scalable architectures.