
Physics-Informed GPC for Alloy Design

Updated 18 June 2026
  • Physics-Informed GPC is a Bayesian framework that embeds domain-specific physics knowledge into Gaussian process classification, enhancing constraint satisfaction in alloy discovery.
  • The method leverages tailored priors and hyperparameter optimization via marginal likelihood to reduce posterior uncertainty and guide active learning in high-dimensional design spaces.
  • Empirical case studies demonstrate improved accuracy metrics and reduced experimental costs in alloy design, validating the integration of physics-based models in GPC.

Physics-Informed Gaussian Process Classification (GPC) refers to a Bayesian framework in which the Gaussian Process (GP) prior mean is endowed with explicit, domain-specific physics-based information, enabling the model to capture and enforce constraints directly relevant to materials and alloy design. The approach addresses constraint satisfaction in alloy discovery by uniting physical models, probabilistic classification, and active learning to efficiently navigate feasible and optimal regions of high-dimensional design spaces (Hardcastle et al., 17 Feb 2025).

1. Mathematical Foundation

The core of physics-informed GPC is the latent function prior:

$$f(x) \sim \mathrm{GP}\big(m(x),\, k(x,x')\big)$$

where $m(x)$ is a physics-informed prior mean, and $k(x,x')$ is typically an RBF plus white noise kernel. For a collection of points $X$, the vector of latent function values $f = [f(x_1), \ldots, f(x_n)]^T$ has a multivariate normal prior:

$$p(f \mid X) = \mathcal{N}\big(f \mid m(X),\, K(X,X) + \sigma_n^2 I\big)$$

Classification is recast as a regression problem using pseudo-targets $y_n^* \in \{+5, -5\}$ and a Gaussian likelihood:

$$p(y^* \mid f) = \prod_n \mathcal{N}\big(y_n^* \mid f(x_n),\, \sigma_n^2\big)$$

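Posterior inference proceeds as in standard GP regression. Letting $\mu_p$ and $\sigma_p^2$ denote the posterior mean and variance at test point $x_*$, and writing $K_y = K(X,X) + \sigma_n^2 I$,

$$\mu_p(x_*) = m(x_*) + k(x_*, X)\, K_y^{-1} \big(y^* - m(X)\big)$$

$$\sigma_p^2(x_*) = k(x_*, x_*) - k(x_*, X)\, K_y^{-1}\, k(X, x_*)$$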

The predicted classification probability is recovered by applying the sigmoid transformation to the posterior mean $\mu_p$:

$$\pi = \sigma(\mu_p) = \frac{1}{1 + \exp(-\mu_p)}$$
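The regression-as-classification recipe above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel is a plain RBF, the noise variance and length scale are fixed by hand, and `toy_prior_mean` is a hypothetical stand-in for a physics model.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(X, y_star, X_test, prior_mean, noise_var=0.1, **kernel_args):
    """Posterior mean and variance of the latent function at X_test."""
    K_y = rbf_kernel(X, X, **kernel_args) + noise_var * np.eye(len(X))
    K_s = rbf_kernel(X_test, X, **kernel_args)
    L = np.linalg.cholesky(K_y)
    # alpha = K_y^{-1} (y* - m(X)), computed through the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_star - prior_mean(X)))
    mu = prior_mean(X_test) + K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    prior_var = np.diag(rbf_kernel(X_test, X_test, **kernel_args))
    var = prior_var - np.sum(v**2, axis=0)
    return mu, var

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_prior_mean(X):
    """Hypothetical physics prior: leans 'feasible' for x > 0.5 (assumption)."""
    return np.where(X[:, 0] > 0.5, 2.0, -2.0)

# Toy 1-D data: binary labels mapped to pseudo-targets +5 / -5.
X_train = np.array([[0.1], [0.4], [0.6], [0.9]])
labels = np.array([0, 0, 1, 1])
y_star = np.where(labels == 1, 5.0, -5.0)
X_test = np.array([[0.0], [0.5], [1.0]])

mu_p, var_p = gp_posterior(X_train, y_star, X_test, toy_prior_mean,
                           noise_var=0.1, length_scale=0.2)
prob = sigmoid(mu_p)  # predicted class-1 probability at each test point
```

Far from the training data the kernel term vanishes, so `mu_p` falls back to the prior mean; this is the mechanism through which the physics model influences predictions in unexplored regions.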

2. Physics-Informed Prior Encoding

The specification of the prior mean $m(x)$ is the central mechanism for integrating physics-based constraints:

Case Study 1 (CALPHAD-based Phase Stability):

For a given alloy composition $x$, equilibrium phase fractions are computed via Thermo-Calc. These are mapped to prior class probabilities, which in turn set the prior mean $m(x)$ of each one-vs-rest binary classifier.

Case Study 2 (Valence Electron Concentration):

The Valence Electron Concentration is evaluated as the composition-weighted average $\mathrm{VEC} = \sum_i c_i \, \mathrm{VEC}_i$, where $c_i$ is the atomic fraction of element $i$, and discrete prior probabilities (Table 4 in Hardcastle et al., 17 Feb 2025) are assigned to each class according to the VEC value, leading to the prior mean $m(x)$.

Case Study 3 (Yield-Strength Constraint):

The Maresca–Curtin model, evaluated at the temperature of interest, provides a yield-strength estimate that is used directly as the prior mean $m(x)$ in the regression for high-temperature yield strength.
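As an illustration of the VEC-based encoding in Case Study 2, the sketch below computes a composition-weighted VEC and turns a discrete prior probability into a prior mean. Several parts are assumptions made for this example only: the VEC cut-offs of 6.87 and 8.0 are the commonly quoted empirical thresholds rather than values taken from the paper, the probabilities 0.8 / 0.5 / 0.2 are placeholders and not the entries of Table 4, and the logit mapping from probability to prior mean is one plausible choice, not necessarily the authors'.

```python
import math

# Valence electron counts per element (standard periodic-table values).
VALENCE_ELECTRONS = {
    "Al": 3, "Ti": 4, "V": 5, "Nb": 5, "Cr": 6, "Mo": 6,
    "Fe": 8, "Co": 9, "Ni": 10,
}

def vec(composition):
    """Composition-weighted VEC; `composition` maps element -> atomic fraction."""
    total = sum(composition.values())
    return sum(frac / total * VALENCE_ELECTRONS[el]
               for el, frac in composition.items())

def prior_probability_bcc(vec_value):
    """HYPOTHETICAL discrete prior P(BCC | VEC); placeholder values only."""
    if vec_value < 6.87:   # assumed cut-off: BCC-favoured region
        return 0.8
    if vec_value < 8.0:    # assumed cut-off: mixed region
        return 0.5
    return 0.2             # FCC-favoured region

def prior_mean_from_probability(p):
    """ASSUMED mapping: logit, so that sigmoid(prior mean) equals p."""
    return math.log(p / (1.0 - p))

alloy = {"Nb": 0.25, "Mo": 0.25, "Ti": 0.25, "V": 0.25}
alloy_vec = vec(alloy)
m_bcc = prior_mean_from_probability(prior_probability_bcc(alloy_vec))
```

With the logit mapping, passing the prior mean through the sigmoid returns the assigned prior probability, so an untrained classifier reproduces the physics rule exactly.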

3. Hyperparameter Optimization

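Kernel hyperparameters $\theta$ and free parameters in $m(x)$ are optimized by maximizing the log marginal likelihood:

$$\log p(y^* \mid X, \theta) = -\tfrac{1}{2} \big(y^* - m(X)\big)^T K_y^{-1} \big(y^* - m(X)\big) - \tfrac{1}{2} \log |K_y| - \tfrac{n}{2} \log 2\pi, \qquad K_y = K(X,X) + \sigma_n^2 I$$

Gradients $\partial \log p(y^* \mid X, \theta) / \partial \theta$ are computed via standard GPR identities, and optimization is performed using L-BFGS-B over 10–50 random restarts (Hardcastle et al., 17 Feb 2025). This allows model flexibility and prior structure to be fitted jointly to the observed pseudo-classification data.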
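The log marginal likelihood itself is cheap to evaluate through a Cholesky factorization. The sketch below does so with NumPy on toy data and, to stay dependency-free, scans a grid of length scales instead of running L-BFGS-B with restarts; the grid search is a simplification for illustration, and `m_physics` is a hypothetical prior mean that already explains part of the pseudo-targets.

```python
import numpy as np

def rbf_kernel(X, length_scale, signal_var=1.0):
    """Squared-exponential Gram matrix for the rows of X."""
    sq = (X[:, None, :] - X[None, :, :]) ** 2
    return signal_var * np.exp(-0.5 * sq.sum(axis=-1) / length_scale**2)

def log_marginal_likelihood(X, y_star, m_X, length_scale, noise_var):
    """log p(y* | X, theta) for a GP with prior mean values m_X at X."""
    n = len(X)
    K_y = rbf_kernel(X, length_scale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K_y)
    r = y_star - m_X                                   # residual w.r.t. prior mean
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r))
    # 0.5 * log|K_y| equals the sum of the log-diagonal of the Cholesky factor
    return (-0.5 * r @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))

# Toy data: eight 1-D points with pseudo-targets +5 / -5.
X = np.linspace(0.0, 1.0, 8)[:, None]
y_star = np.where(X[:, 0] > 0.5, 5.0, -5.0)
m_zero = np.zeros(8)            # uninformative prior mean
m_physics = 0.6 * y_star        # hypothetical, partially correct physics prior

# Coarse grid over the length scale (stand-in for L-BFGS-B with restarts).
grid = np.logspace(-2, 1, 30)
scores = [log_marginal_likelihood(X, y_star, m_zero, ls, 0.1) for ls in grid]
best_length_scale = grid[int(np.argmax(scores))]

lml_zero = log_marginal_likelihood(X, y_star, m_zero, 0.3, 0.1)
lml_physics = log_marginal_likelihood(X, y_star, m_physics, 0.3, 0.1)
```

At fixed hyperparameters, `lml_physics` exceeds `lml_zero` because the informative prior mean shrinks the residual in the quadratic term; in practice the grid would be replaced by a gradient-based optimizer.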

4. Active Learning and Constraint Incorporation

Information-efficient exploration is achieved by Shannon entropy-based acquisition:

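$$H(x) = -\sum_{c} p_c(x) \log p_c(x)$$

where $p_c(x)$ is the predicted probability of class $c$ at $x$. The next candidate for acquisition is

$$x_{\text{next}} = \arg\max_{x \in \mathcal{X}_{\text{cand}}} H(x)$$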

For categorical constraints (Case 2):

A one-vs-rest GPC ensemble is constructed, and the entropy of the predicted probability vector $\mathbf{p}(x) = [p_1(x), \ldots, p_C(x)]$ drives experimental campaign selection, focusing on high-uncertainty (e.g., phase boundary) regions.
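The entropy-driven selection step can be sketched as follows. The candidate scores are made-up numbers standing in for the outputs of three one-vs-rest classifiers, and the row-wise normalization is an assumed implementation of the post hoc normalization of the ensemble.

```python
import numpy as np

def normalize_one_vs_rest(scores):
    """Rescale per-class one-vs-rest probabilities so each row sums to one."""
    return scores / scores.sum(axis=-1, keepdims=True)

def shannon_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a probability matrix."""
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

# Hypothetical one-vs-rest outputs for three candidate alloys, three classes.
candidate_scores = np.array([
    [0.95, 0.03, 0.02],   # confidently class 0
    [0.40, 0.35, 0.30],   # near a phase boundary
    [0.10, 0.85, 0.10],   # confidently class 1
])

probs = normalize_one_vs_rest(candidate_scores)
H = shannon_entropy(probs)
next_index = int(np.argmax(H))   # candidate to measure next
```

Here the second candidate is selected, since its nearly uniform probability vector has the highest entropy.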

For continuous threshold constraints (Case 3):

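Given the predictive normal distribution $\mathcal{N}\big(\mu_p(x), \sigma_p^2(x)\big)$, the probability of exceeding the threshold $\tau$ is

$$P(y > \tau \mid x) = 1 - \Phi\!\left(\frac{\tau - \mu_p(x)}{\sigma_p(x)}\right)$$

where $\Phi$ is the standard normal cumulative distribution function.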

This binary-classification proxy is used in the entropy criterion above.
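A minimal sketch of this proxy, using only the standard library: the exceedance probability comes from the normal CDF, and its binary entropy is the acquisition score. The yield-strength numbers (in MPa) and the threshold are illustrative assumptions, not values from the paper.

```python
import math

def prob_exceeds(mu, sigma, threshold):
    """P(y > threshold) for y ~ N(mu, sigma^2), via the error function."""
    z = (mu - threshold) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a Bernoulli(p) outcome."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

threshold = 500.0  # hypothetical yield-strength requirement, MPa

# (posterior mean, posterior std) for two hypothetical candidates
borderline = prob_exceeds(500.0, 50.0, threshold)   # mean sits on the threshold
confident = prob_exceeds(700.0, 50.0, threshold)    # mean well above it

score_borderline = binary_entropy(borderline)
score_confident = binary_entropy(confident)
```

The borderline candidate has probability 0.5 and the maximum entropy of ln 2, so it would be queried before the candidate whose feasibility is already nearly certain.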

5. Empirical Results Across Case Studies

Three distinct application studies anchor the framework:

| Case | Constraint Type | Prior Model | Performance Impact |
|------|-----------------|-------------|--------------------|
| 1 | Phase stability, static | CALPHAD | Higher median accuracy than controls; higher AUC; tighter recall, F1, and Brier loss [Fig. 4] |
| 2 | Categorical, active | VEC rule | Physics-informed AL converges in fewer iterations than vanilla; higher early accuracy; lower spread in accuracy [Fig. 5] |
| 3 | Continuous threshold, active | Maresca–Curtin model | Higher recall than vanilla; lower Brier loss; lower log-loss in the first iterations [Figs. 6, 7] |

In all scenarios, the introduction of a physics-based prior mean $m(x)$ enhanced both predictive accuracy and sample efficiency, especially in data-scarce regimes or with expensive experimental endpoints (e.g., XRD, mechanical tests) (Hardcastle et al., 17 Feb 2025).

6. Functional Advantages and Limitations

Physics-informed priors sharply reduce posterior uncertainty where domain knowledge is robust, improving model calibration and recall. As a result, fewer high-cost experiments are needed to map feasible regions or constraints. Across recall, accuracy, F1, Brier loss, and log-loss, models with a physics-informed prior mean $m(x)$ outperform both purely statistical (vanilla GPC, uniform prior) and heuristic (pure CALPHAD) baselines in all tested alloy-design scenarios. Notably, the surrogate regression formulation simplifies implementation but introduces a Gaussian likelihood approximation, which differs from full Laplace/EP GPC inference.

A reliance on sufficiently accurate physical models (e.g., CALPHAD, VEC) is a limitation: biased priors can mislead the classifier. The post hoc normalization in the one-vs-rest ensemble may impair multiclass probabilistic calibration. Future extensions include replacing the surrogate with true non-Gaussian inference and extending the framework to multi-constraint, multi-objective Bayesian optimization by chaining constraint classifiers with physics-informed GP regressors for objectives.

7. Outlook and Broader Implications

Physics-Informed GPC formalizes a unification of mechanistic modeling, Bayesian classification, and sample-efficient exploration for constraint-driven scientific discovery. By explicitly embedding physics-based approximations into the GP mean, the approach achieves robust extrapolation, improved active learning navigation, and substantial cost reductions for experimental alloy design. A plausible implication is enhanced design efficiency in any setting where feasible regions are expensive to probe and credible physical models exist. Continued research may enable hierarchical or compositional priors, full multiclass calibration, and broad application to multi-objective optimization in complex scientific domains (Hardcastle et al., 17 Feb 2025).
