
Deep Grey-Box Modeling

Updated 1 July 2025
  • Deep grey-box modeling is a hybrid approach integrating known physics with deep learning to complete or correct imperfect physical models using limited and unpaired data.
  • This method leverages Optimal Transport to learn minimal corrections to physics models from unpaired data, enhancing accuracy while preserving physical interpretability.
  • Deep grey-box modeling provides improved accuracy and parameter adherence over black-box models on physics tasks, offering interpretable insight into corrections.

Deep grey-box modeling denotes a class of hybrid modeling strategies for physical systems in which the governing equations (the physics) are only partially known or approximate, integrating domain knowledge with deep learning in a principled, generative fashion. In the setting examined in the cited paper, the approach addresses the challenge of "completing" or correcting imperfect ODE/PDE-based models so that they align with real-world phenomena when only limited, unpaired data are available, while preserving transparency and interpretability through physics-based inductive biases.

1. Hybrid Generative Framework: Structure and Integration

The hybrid generative approach combines a known (though possibly incomplete) physics model, $f_p$, with a deep neural network predictor, $f_\omega$, to construct a composite generator $T(f_p, f_\omega; x, \theta, z)$ that maps initial conditions $x$, system parameters $\theta$, and (optionally) noise $z$ to trajectories $y$ in data space:

$$T(f_p, f_\omega; x, \theta, z) = \mathrm{ODESolve}\!\left( \frac{dy(t)}{dt} = f_\omega(y(t), \theta, z) \circ f_p(y(t), \theta) \;\middle|\; y(0) = x(0) \right)$$

This construction allows for both deterministic (one-to-one) and stochastic (one-to-many) mappings.
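As a concrete illustration, the composite generator can be sketched with a fixed-step Euler integrator standing in for ODESolve. Here the composition $f_\omega \circ f_p$ is interpreted as the network receiving (and correcting) the physics-model derivative, and the "network" is a hand-set linear map rather than a trained model; all function names and weights below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def euler_odesolve(f, y0, t_grid):
    """Fixed-step Euler integrator standing in for ODESolve."""
    ys = [np.asarray(y0, dtype=float)]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        ys.append(ys[-1] + (t1 - t0) * f(ys[-1], t0))
    return np.stack(ys)

def f_p(y, theta):
    """Known (possibly incomplete) physics: frictionless pendulum."""
    angle, velocity = y
    return np.array([velocity, -theta["g_over_l"] * np.sin(angle)])

def f_omega(y, theta, z, physics_dydt, w):
    """Stand-in neural correction: adjusts the physics derivative.

    A tiny linear model in y (a trained network in the real method);
    z would inject stochasticity for one-to-many mappings."""
    return physics_dydt + w @ y + 0.0 * z

def hybrid_generator(x0, theta, z, w, t_grid):
    """Composite generator T(f_p, f_omega; x, theta, z)."""
    def dydt(y, t):
        return f_omega(y, theta, z, f_p(y, theta), w)
    return euler_odesolve(dydt, x0, t_grid)

theta = {"g_over_l": 9.81}
t_grid = np.linspace(0.0, 2.0, 201)
w = np.array([[0.0, 0.0], [0.0, -0.3]])   # mimics a damping correction
traj = hybrid_generator(np.array([0.5, 0.0]), theta, z=0.0, w=w, t_grid=t_grid)
print(traj.shape)  # (201, 2): [angle, angular velocity] over time
```

Because the physics derivative always enters the generator, setting the correction weights to zero recovers the pure physics simulation exactly.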

A core innovation is the use of Optimal Transport (OT) for learning a mapping from simulations obtained from the imperfect physics ($p(x)$) to the actual data-generating process ($v(y)$), even when the datasets are unpaired. The conditional OT mapping learns to minimally warp the source (physics) distribution such that output statistics match the observed data, subject to the constraint that the physics-based generative process is only "completed" or corrected where necessary.

The optimization problem for the OT-enhanced generator leverages a weak-OT kernel cost:

$$W_{k,\gamma}(p, v) = \sup_f \inf_{T,\, \nu \in P(Z)}\, \mathbb{E}_{x \sim p,\, z \sim \nu}\left[ C_{k,\gamma}(x, T_\omega(x, \theta, z)) \right] - \mathbb{E}_{y \sim v}\left[ f(y) \right]$$

where $C_{k,\gamma}$ (Eq. 4) is an RKHS-based weak quadratic cost, and the adversarial optimization alternates between training the generator ($T_\omega$) and the critic ($f$).

2. Physics-Based Inductive Bias and Model Completion

The physics-based inductive bias is realized by always including the output of the analytic physics model ($f_p$) in the generator architecture. The neural network component, $f_\omega$, learns only residual corrections to the known dynamics, not the overall system behavior, focusing model capacity and preserving consistency with physical law.

The mapping is conditioned on the system parameters $\theta$, ensuring the correct dependence of generated trajectories on physical quantities and enabling physically meaningful parameter inference. The model leverages OT to ensure that corrections to the physics model are minimal and well-localized, modifying the simulation only where divergence from observed data requires it.

Concrete examples in the paper demonstrate this process in:

  • Learning the missing friction term in a pendulum ODE where the simulator omits friction ($f_p$ is frictionless).
  • Completing the missing reaction or advection term in reaction-diffusion and advection-diffusion PDEs.

The architecture thus enables interpretable corrections that can be analyzed and, if desired, incorporated back into improved physics-based simulators.
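To make the pendulum case concrete, the following sketch evaluates in closed form the residual that a completed model must account for when $f_p$ omits friction. In the actual method this residual is learned from unpaired data; here the true damped dynamics are assumed known purely for illustration (the constants are hypothetical):

```python
import numpy as np

# True system: damped pendulum; the imperfect physics f_p omits friction.
g_over_l, gamma = 9.81, 0.25

def f_true(y):
    angle, vel = y
    return np.array([vel, -g_over_l * np.sin(angle) - gamma * vel])

def f_p(y):
    angle, vel = y
    return np.array([vel, -g_over_l * np.sin(angle)])

# The residual a completed model must learn, evaluated along a trajectory:
y = np.array([0.5, 0.0])
dt, residuals, states = 0.01, [], []
for _ in range(500):
    residuals.append(f_true(y) - f_p(y))   # what f_omega should account for
    states.append(y.copy())
    y = y + dt * f_true(y)

residuals = np.stack(residuals)
states = np.stack(states)
# The residual vanishes in the angle equation and equals -gamma * velocity
# in the velocity equation, i.e., exactly the missing friction term.
print(np.allclose(residuals[:, 0], 0.0))                    # True
print(np.allclose(residuals[:, 1], -gamma * states[:, 1]))  # True
```

Because the learned correction has this simple analytic structure, inspecting it (e.g., regressing it against state variables) can recover the missing term for reincorporation into the simulator.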

3. Performance Analysis and Model Transparency

Empirical studies demonstrate:

  • On deterministic (one-to-one) mappings (e.g., ODE and PDE cases), the deep grey-box/OT framework achieves lower normalized RMSE compared to black-box models.
  • In stochastic (one-to-many) mapping tasks, the OT-based hybrid models display improved sample diversity and better Maximum Mean Discrepancy (MMD) and C2ST scores.
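A minimal sketch of the MMD metric used in such evaluations, assuming a Gaussian kernel and the standard biased estimator (not the paper's exact evaluation code):

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between sample sets a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

rng = np.random.default_rng(0)
same = rng.normal(0.0, 1.0, (500, 2))
also_same = rng.normal(0.0, 1.0, (500, 2))
shifted = rng.normal(2.0, 1.0, (500, 2))

# MMD is near zero for samples from the same distribution, larger otherwise.
print(mmd2(same, also_same) < mmd2(same, shifted))  # True
```

Lower MMD between generated and observed trajectory sets indicates better distributional match, which is why it complements per-trajectory RMSE in the stochastic setting.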

Critically, these models ensure more faithful adherence to physical parameterization and avoid the “parameter bypass” common in pure neural approaches, where models match output statistics but disregard underlying parameter relationships. Grey-box models enable granular inspection of both the physics model and the learned correction, facilitating scientific understanding, diagnostics, and trust—a distinction from black-box generative models.

Table: Summary of Grey-Box vs. Black-Box Performance

| Aspect | Grey-Box Hybrid Model | Black-Box Model |
|---|---|---|
| Physics consistency | High | Often poor |
| Interpretability | High (component analysis) | Low |
| Data efficiency | High | Lower |
| Generation fidelity | High on conditional tasks | Often lower |
| Handles unpaired data | Yes (via OT) | Yes, but less robust |

4. Application Domains and Implications

Hybrid generative models are especially pertinent for scientific and engineering scenarios where:

  • Physics-based models (ODEs or PDEs) are available but incomplete, omitting key phenomena (e.g., friction, advection, or interaction terms).
  • There exists limited and unpaired observational data, which rules out classic supervised calibration or supervised generative modeling.
  • Robust parameter inference and scientific interpretability of corrections are critical, as in engineering diagnostics, geophysical modeling, material science, or process control.

The framework provides avenues for:

  • Model completion and uncertainty quantification in domains where simulators are approximate.
  • Improved parameter estimation and inverse problems constrained by (possibly partial) physical knowledge.
  • Trustworthy integration of machine learning with scientific computing, highlighting components where the model’s correction diverges from expectation.

This approach also facilitates future research into scalable, interpretable, and sample-efficient AI for physical sciences, and provides practical mechanisms for transparent model improvement in digital twin applications.

5. Mathematical and Algorithmic Details

Key mathematical contributions in the work include:

  • The conditional OT cost using a reproducing kernel Hilbert space:

Ck,γ(x,v)=k(x,x)+k(y,y)dv(y)2k(x,y)dv(y)C_{k,\gamma}(x, v) = k(x, x) + \int k(y, y) dv(y) - 2\int k(x, y) dv(y)

  • Generative model update loop combining adversarial optimization and gradient penalty enforcement (Eq. 19).
  • Hybrid ODE/PDE simulation embedded in an end-to-end trainable generative model:

$$\frac{dy}{dt} = f_\omega(y(t), \theta, z) \circ f_p(y(t), \theta), \qquad y(0) = x(0)$$

These formulations underpin the ability to learn from unpaired data while embedding physical constraints directly in the generative process.
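A Monte-Carlo sketch of the kernel cost, assuming a Gaussian kernel and ignoring the $\gamma$-weighting: the integrals over $\nu$ are replaced by averages over samples $y_j \sim \nu$, so the quadratic term averages the kernel over pairs of samples (function names and bandwidth are illustrative assumptions):

```python
import numpy as np

def k(a, b, bandwidth=1.0):
    """Gaussian kernel on vectors a, b."""
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * bandwidth ** 2))

def kernel_cost(x, y_samples, bandwidth=1.0):
    """Monte-Carlo estimate of the kernel cost between the point x and a
    distribution nu represented by samples y_j ~ nu:
    k(x,x) + E[k(y, y')] - 2 E[k(x, y)]."""
    kxx = k(x, x, bandwidth)
    kyy = np.mean([[k(yi, yj, bandwidth) for yj in y_samples]
                   for yi in y_samples])
    kxy = np.mean([k(x, yj, bandwidth) for yj in y_samples])
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(1)
x = np.array([0.0, 0.0])
near = x + 0.1 * rng.normal(size=(200, 2))                      # nu near x
far = x + np.array([3.0, 0.0]) + 0.1 * rng.normal(size=(200, 2))  # nu far away

# The cost is small when nu concentrates around x and larger otherwise,
# which is what drives corrections to stay minimal.
print(kernel_cost(x, near) < kernel_cost(x, far))  # True
```

Minimizing this cost over the generator's conditional outputs penalizes moving probability mass away from the physics simulation unless the critic term demands it.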

6. Future Directions and Open Challenges

The methodology introduces a template for bridging deep learning and physics-based simulation in scenarios afflicted by model misspecification and limited, unpaired data. Future research could pursue:

  • Extensions to broader classes of PDEs/ODEs, higher-dimensional and multiphysics systems.
  • Adaptive or hierarchical schemes where the division of known and learned components evolves during training.
  • Integration with uncertainty quantification, enabling risk-assessed decision support in engineering and scientific modeling.
  • Deployment in industrial digital twins and real-world scientific discovery tasks where trust, interpretability, and error diagnosis are essential.

This approach advances the state of trustworthy, interpretable AI for scientific generative tasks where only partial system knowledge is available, while remaining robust to the challenges of limited, unpaired observational data.