
Physics-informed features in supervised machine learning (2504.17112v1)

Published 23 Apr 2025 in stat.ML, cs.LG, cs.NA, and math.NA

Abstract: Supervised machine learning involves approximating an unknown functional relationship from a limited dataset of features and corresponding labels. The classical approach to feature-based machine learning typically relies on applying linear regression to standardized features, without considering their physical meaning. This may limit model explainability, particularly in scientific applications. This study proposes a physics-informed approach to feature-based machine learning that constructs non-linear feature maps informed by physical laws and dimensional analysis. These maps enhance model interpretability and, when physical laws are unknown, allow for the identification of relevant mechanisms through feature ranking. The method aims to improve both predictive performance in regression tasks and classification skill scores by integrating domain knowledge into the learning process, while also enabling the potential discovery of new physical equations within the context of explainable machine learning.

Summary

  • The paper introduces physics-informed features (PIFs) that encode physical dimensions into new feature combinations to enhance predictive models.
  • It proposes constructing and standardizing PIFs via dimensional analysis and domain expertise, then applying ridge regression to improve prediction accuracy.
  • Experiments on synthetic and real datasets demonstrate robust performance improvements and the recovery of known physical laws.

This paper (2504.17112) introduces a novel approach to supervised machine learning that incorporates domain knowledge by constructing "physics-informed features" (PIFs). The core idea is to move beyond standard data preprocessing techniques, such as feature standardization, which disregard the physical meaning, dimensions, and units of the input features. Instead, the proposed method creates new, non-linear feature combinations based on physical principles and dimensional analysis. These PIFs are designed to share the same physical dimension as the target variable (label).

The operational workflow of the proposed physics-informed approach contrasts with the classical approach:

  1. Classical Approach:
    • Original features and labels are collected.
    • Features are standardized (scaled and made dimensionless).
    • A machine learning model (e.g., linear regression) is trained on the standardized features.
    • Predictions are made on new, standardized features.
  2. Physics-Informed Approach:
    • Original features and labels are collected, noting their physical dimensions and units.
    • Physics-Informed Features (PIFs) are constructed by combining original features using physical laws, dimensional analysis, or domain expertise (see Algorithm 3.1). All PIFs must have the same physical dimension as the label.
    • The generated PIFs are standardized to create Standardized Physics-Informed Features (SPIFs).
    • A machine learning model (e.g., ridge regression) is trained using the SPIFs and the corresponding labels.
    • Predictions are made using new, standardized PIFs derived from new original features.
    • Optionally, feature ranking is applied to the SPIFs to identify the most influential PIFs for the prediction task (see Algorithm 3.2). The regression coefficients of the ranked SPIFs can be de-standardized to infer the physical coefficients of the underlying model equation.
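The physics-informed workflow above can be sketched end to end on a Bernoulli-style toy problem. This is a minimal sketch, not the paper's code: the feature ranges, noise level, regularization strength, and names such as `pifs` and `spifs` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
g = 9.81  # gravitational acceleration, m/s^2

# Original features with physical units: density [kg/m^3], velocity [m/s], height [m]
rho = rng.uniform(900, 1100, n)
v = rng.uniform(0.5, 5.0, n)
h = rng.uniform(0.0, 10.0, n)

# Label: Bernoulli-style total pressure [Pa], plus measurement noise
y = 0.5 * rho * v**2 + rho * g * h + rng.normal(0, 10.0, n)

# Step 1: construct PIFs -- every column has the dimension of the label (Pa).
# The third column is a dimensionally consistent distractor term.
pifs = np.column_stack([rho * v**2, rho * g * h, rho * v * np.sqrt(g * h)])

# Step 2: standardize PIFs -> SPIFs (keep mean/std for later de-standardization)
mu, sigma = pifs.mean(axis=0), pifs.std(axis=0)
spifs = (pifs - mu) / sigma

# Step 3: ridge regression on SPIFs, closed form (X^T X + lam I)^-1 X^T y
lam = 1e-3
X = np.column_stack([np.ones(n), spifs])  # intercept + SPIFs
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Step 4: predict (here on the training SPIFs; new data would be
# standardized with the same mu and sigma)
y_hat = X @ w
mae = np.abs(y - y_hat).mean()
```

Because the true relationship is linear in the PIFs (with coefficients 0.5 and 1 on the first two columns), the residual error is dominated by the injected noise.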

The theoretical foundation connects this approach to Reproducing Kernel Hilbert Spaces (RKHS) and inverse problems. Constructing the PIF map ϕ(x) can be viewed as defining a linear operator A. Solving the supervised learning problem in the space of PIFs is mathematically equivalent to solving a regularized inverse problem associated with this physics-informed operator A. This connection is illustrated in Figure 1 of the paper (2504.17112), showing the transformation from measured features to a physics-informed operator, solving the inverse problem, and mapping the solution back to the predicted output.
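In symbols, the equivalence can be stated compactly. The notation below is a hedged sketch consistent with the ridge regression used in the paper, not copied from it:

```latex
% Ridge regression over PIFs as Tikhonov-regularized inversion of the
% physics-informed operator A built from the PIF map \phi:
A_{ij} = \phi_j(x_i), \qquad
\hat{w} = \arg\min_{w}\; \| A w - y \|_2^2 + \lambda \| w \|_2^2
        = \left( A^{\top} A + \lambda I \right)^{-1} A^{\top} y .
```

The regularization parameter λ plays the same stabilizing role as in classical Tikhonov inversion.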

Implementation Details:

  • PIF Construction (Algorithm 3.1): This is a critical step requiring domain knowledge. Given original features F_1, …, F_m with units u_1, …, u_m and a label Y with unit u_Y, PIFs PIF_1, …, PIF_p are generated such that each PIF_j has the unit u_Y. This can involve polynomial combinations (F_i^a F_k^b), ratios (F_i/F_k), or more complex forms guided by known physics.
  • Standardization: Similar to the classical approach, PIFs are standardized (SPIFs) before training. This helps with numerical stability, especially for algorithms sensitive to feature scaling, like ridge regression.
  • Machine Learning Model: The paper primarily uses ridge regression, demonstrating that standard linear models can leverage non-linear relationships through the PIF transformation. The authors mention that results were robust with other methods like Support Vector Machines.
  • Feature Ranking (Algorithm 3.2): A greedy approach is used. SPIFs are initially ranked by the magnitude of their learned regression coefficients. The model is then iteratively trained using progressively more SPIFs based on this ranking, monitoring performance metrics (MAE/MSE for regression, skill scores for classification). The set of PIFs that yields saturated performance is considered the most relevant subset.
  • De-standardization: After identifying the most influential SPIFs and their coefficients, these coefficients are de-standardized using the mean and standard deviation calculated during the SPIF generation step. This process recovers coefficients that correspond to the original PIFs, allowing for the identification of the potential physical relationship Y ≈ Σ_j β_j PIF_j.
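The greedy ranking and de-standardization steps can be sketched as follows. This is a hedged reading of the procedure described above; the function names and the closed-form ridge solver are assumptions, not the authors' implementation of Algorithm 3.2.

```python
import numpy as np

def ridge_fit(X, y, lam=1e-3):
    """Ridge regression with intercept via the normal equations."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    return w[0], w[1:]  # intercept, coefficients

def greedy_rank(spifs, y, lam=1e-3):
    """Rank SPIFs by |coefficient| magnitude, then retrain with the top-k
    features for k = 1..p, recording the MAE at each step."""
    _, coef = ridge_fit(spifs, y, lam)
    order = np.argsort(-np.abs(coef))  # most influential first
    maes = []
    for k in range(1, spifs.shape[1] + 1):
        b0, w = ridge_fit(spifs[:, order[:k]], y, lam)
        y_hat = b0 + spifs[:, order[:k]] @ w
        maes.append(np.abs(y - y_hat).mean())
    return order, maes

def destandardize(coef_std, intercept_std, mu, sigma):
    """Map coefficients learned on SPIFs back to the original PIF scale,
    so that y ~ beta_0 + sum_j beta_j * PIF_j."""
    beta = coef_std / sigma
    beta0 = intercept_std - np.sum(coef_std * mu / sigma)
    return beta0, beta
```

The point at which the MAE curve from `greedy_rank` saturates indicates the relevant PIF subset, and `destandardize` recovers physically interpretable coefficients from the standardized fit.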

Practical Applications and Experiments:

The paper demonstrates the approach using three synthetic experiments and one real-world application:

  1. Synthetic Fluid Dynamics (Bernoulli Equation):
    • Problem: Predict a constant value related to pressure in fluid flow based on features like density, velocity, height, flow rate, area, viscosity.
    • PIF Construction: PIFs were generated with the dimension of pressure (Pa), including terms corresponding to static pressure, dynamic pressure, and hydrostatic pressure found in the Bernoulli equation, plus other dimensionally consistent terms (Table 1 (2504.17112)).
    • Results: Using SPIFs significantly reduced MAE and MSE compared to standardized features (SFs) across different noise levels (Table 2, Figure 2 (2504.17112)). Feature ranking successfully identified the three PIFs corresponding to the Bernoulli equation terms as the most important, and their de-standardized coefficients accurately approximated the expected physical coefficients (Table 3, Figure 3 (2504.17112)).
    • Implication: Demonstrates improved predictive accuracy and the ability to recover known physical laws.
  2. Synthetic Pulsar Magnetic Dissipation:
    • Problem: Predict the magnetic energy dissipation rate of a pulsar based on features like radius, magnetic field, angular velocity, inclination angle, period, mass, moment of inertia, rotational energy.
    • PIF Construction: PIFs were generated with the dimension of power (W), including a term corresponding to the known magnetic dissipation law (Table 4 (2504.17112)).
    • Results: Training with SPIFs resulted in better prediction performance (lower errors) than with SFs. The experiment also showed that performance is significantly better when the PIF corresponding to the true physical law is included in the training set of SPIFs, highlighting the value of encoding known physics if available (Figure 4 (2504.17112)).
    • Implication: Shows robustness and the importance of incorporating known physics into PIF design.
  3. Synthetic Binary System Classification:
    • Problem: Classify whether a binary system is gravitationally bound (label 1) or unbound (label 0) based on masses, relative velocity, and distance.
    • PIF Construction: PIFs were generated with the dimension of energy (J), including terms related to kinetic and potential energy (Table 5 (2504.17112)). The label is determined by the sign of the total energy (PIF_1 + PIF_2).
    • Results: Classification using SPIFs yielded significantly better skill scores (TSS, HSS, Accuracy, Specificity) compared to using SFs (Confusion Matrices and Table 6 (2504.17112)).
    • Implication: Extends the benefits of PIFs to classification tasks, showing improved discrimination based on physically meaningful feature combinations.
  4. Real-World Solar Flare Forecasting:
    • Problem: Binary classification to forecast solar flares based on features extracted from solar active region magnetograms.
    • Original Features: Nine features related to electric current, force, magnetic helicity, magnetic flux, area, energy density, magnetic field strength, magnetic field gradient, and characteristic length (Table 7 (2504.17112)).
    • PIF Construction: PIFs were generated with the dimension of energy (T·A·m^2, a unit related to magnetic energy flux), including various combinations of the original features.
    • Results: Training with SPIFs led to improved skill scores (TSS, HSS, Accuracy, Specificity) compared to SFs, although the improvement was less drastic than in synthetic cases (Confusion Matrices and Table 8 (2504.17112)). Feature ranking identified PIF_2 = ΦI (magnetic flux times electric current) as the most influential PIF.
    • Implication: Demonstrates that the method is applicable to complex real-world problems where the underlying physics is not fully known. The feature ranking can point to physically relevant combinations (like ΦI\Phi I, related to magnetic helicity, known to be important in flare physics), potentially aiding scientific discovery.
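The classification experiments can be illustrated with a toy reconstruction of the bound/unbound setup in normalized units (G = 1). The parameter distributions and the regression-plus-threshold classifier below are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Normalized units (G = 1): two masses, relative speed, separation (assumed ranges)
m1 = rng.uniform(0.5, 2.0, n)
m2 = rng.uniform(0.5, 2.0, n)
v = rng.uniform(0.1, 2.0, n)
r = rng.uniform(0.5, 5.0, n)

# PIFs with the dimension of energy: kinetic (reduced-mass) and potential terms
kinetic = 0.5 * (m1 * m2 / (m1 + m2)) * v**2
potential = -m1 * m2 / r
label = (kinetic + potential < 0).astype(float)  # 1 = bound, 0 = unbound

def standardize(X):
    return (X - X.mean(axis=0)) / X.std(axis=0)

def ridge_threshold_classify(X, y, lam=1e-3):
    """Ridge regression on 0/1 labels, thresholded at 0.5."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    return (A @ w > 0.5).astype(float)

sfs = standardize(np.column_stack([m1, m2, v, r]))          # standardized features
spifs = standardize(np.column_stack([kinetic, potential]))  # standardized PIFs
acc_sf = (ridge_threshold_classify(sfs, label) == label).mean()
acc_spif = (ridge_threshold_classify(spifs, label) == label).mean()
```

Because the decision boundary is exactly linear in the two energy PIFs but highly non-linear in the raw features, the SPIF classifier typically separates the classes better on draws like this, mirroring the qualitative comparison in Table 6; exact numbers depend on the assumed distributions.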

Implementation Considerations:

  • Domain Expertise: Constructing meaningful PIFs requires significant domain knowledge to identify relevant physical quantities and potential relationships.
  • Combinatorial Explosion: If the original features have many different dimensions, generating all dimensionally consistent combinations can lead to a very large number of PIFs, increasing computational complexity. Heuristics or prior knowledge are needed to select a reasonable set of PIFs.
  • Dimensional Analysis Tools: Implementing PIF generation can be aided by software libraries that handle dimensional analysis to check the consistency of feature combinations.
  • Computational Requirements: Training on a larger set of PIFs (especially if p > m) might require more computation than training on original features, depending on the chosen ML algorithm. However, standard techniques like ridge regression are efficient.
  • Feature Ranking Cost: The greedy feature ranking algorithm involves retraining the model multiple times, adding computational cost.
  • Applicability: While demonstrated for regression and classification with ridge regression, the concept of PIFs can be extended to other supervised learning tasks and models.
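To make the combinatorial point concrete, monomial PIF candidates can be enumerated by matching exponent vectors over the base dimensions. This is a minimal sketch: the feature set, the mass/length/time-only basis, and the function name are illustrative assumptions, not the paper's Algorithm 3.1.

```python
from itertools import product

# Dimensions as exponent tuples over SI base units (M, L, T) -- a toy subset.
FEATURES = {
    "rho": (1, -3, 0),   # density: kg m^-3
    "v":   (0, 1, -1),   # velocity: m s^-1
    "h":   (0, 1, 0),    # height: m
}
LABEL_DIM = (1, -1, -2)  # pressure: kg m^-1 s^-2

def candidate_pifs(features, label_dim, max_exp=2):
    """Enumerate monomials prod_i F_i^{a_i} whose dimension-exponent vector
    matches the label's (brute force: (2*max_exp+1)^m candidates)."""
    names = list(features)
    pifs = []
    for exps in product(range(-max_exp, max_exp + 1), repeat=len(names)):
        dim = tuple(sum(e * features[n][k] for n, e in zip(names, exps))
                    for k in range(len(label_dim)))
        if dim == label_dim and any(exps):
            pifs.append({n: e for n, e in zip(names, exps) if e})
    return pifs
```

With m features and exponents in [-k, k] the search visits (2k + 1)^m candidates, which is exactly the combinatorial explosion noted above; pruning with prior knowledge or a dimensional-analysis library keeps it tractable. Here the only pressure-dimensioned monomial within the bound is the dynamic-pressure term rho·v^2.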

In summary, the paper provides a practical framework for integrating physics into feature engineering for supervised ML. By creating dimensionally homogeneous, physics-informed features, the method enhances model interpretability, often improves predictive performance, and offers a mechanism (feature ranking) to potentially identify key physical drivers from data, even when the exact underlying laws are unknown.
