Integrating Scientific Knowledge with Machine Learning for Engineering and Environmental Systems (2003.04919v6)

Published 10 Mar 2020 in physics.comp-ph, cs.LG, and stat.ML

Abstract: There is a growing consensus that solutions to complex science and engineering problems require novel methodologies that are able to integrate traditional physics-based modeling approaches with state-of-the-art ML techniques. This paper provides a structured overview of such techniques. Application-centric objective areas for which these approaches have been applied are summarized, and then classes of methodologies used to construct physics-guided ML models and hybrid physics-ML frameworks are described. We then provide a taxonomy of these existing techniques, which uncovers knowledge gaps and potential crossovers of methods between disciplines that can serve as ideas for future research.

Citations (339)

Summary

  • The paper surveys four classes of techniques that integrate physical laws with ML models to obtain accurate and physically consistent predictions.
  • Physics-guided loss functions, initialization, and architecture design can improve convergence and sample efficiency, particularly in data-scarce settings.
  • The survey highlights cross-disciplinary innovation, encouraging collaboration between traditional scientific modeling and advanced machine learning.

Integrating Scientific Knowledge with Machine Learning for Engineering and Environmental Systems: An Academic Overview

The paper surveys methodologies that integrate traditional physics-based modeling with ML techniques to tackle complex problems in engineering and environmental systems. Its premise is that combining ML's data-driven capabilities with established scientific knowledge can improve predictive performance, keep predictions physically consistent, and help models generalize beyond the conditions seen in training.

Methodological Overview

The paper categorizes approaches into four primary methods:

  1. Physics-Guided Loss Function: Physical laws are integrated directly into the ML model's loss function. Penalizing deviations from these laws during training guides the model towards physically consistent predictions, which is crucial when observational data are scarce. The approach has been applied to lake temperature modeling and to solving PDEs, in both cases by embedding physical constraints in the optimization process.
  2. Physics-Guided Initialization: The ML model is pre-trained on synthetic data generated by a mechanistic model and then fine-tuned on real observations. The pre-training gives the model a better starting point, leading to faster convergence and better generalization from limited observational data. This is a form of transfer learning that mitigates data scarcity, and it is particularly useful when training robotics and autonomous systems.
  3. Physics-Guided Architecture Design: Custom ML architectures embed physical principles, such as invariances and symmetries, in the model structure itself, which improves interpretability and keeps outputs physically meaningful. Notable instances include networks that encode rotational invariance for fluid dynamics and networks built on Hamiltonian mechanics so that conservation laws are preserved.
  4. Hybrid Physics-ML Models: ML components are combined with a traditional physics-based model, either replacing one of its subcomponents or acting as a correction model that predicts the physics-based model's residual error. Such models draw on both physics-based and data-driven information, which can improve robustness and predictive reliability.
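
Items 1 and 4 in the list above can be illustrated with a minimal NumPy sketch. It is not the surveyed authors' implementation. All function names (`water_density`, `physics_penalty`, `physics_guided_loss`, `fit_residual_correction`) and the weight `lam` are hypothetical. The density-temperature relation is an empirical freshwater formula assumed here for illustration, and the constraint that density should not decrease with depth is likewise an assumed example of a physical law for the lake temperature setting.

```python
import numpy as np

def water_density(temp_c):
    """Assumed empirical freshwater density (kg/m^3) for temperature in deg C."""
    t = np.asarray(temp_c, dtype=float)
    return 1000.0 * (1.0 - (t + 288.9414) * (t - 3.9863) ** 2
                     / (508929.2 * (t + 68.12963)))

def physics_penalty(pred_temp_by_depth):
    """Mean violation of 'density does not decrease with depth'.

    Predictions are ordered from the shallowest to the deepest point.
    """
    rho = water_density(pred_temp_by_depth)
    drop = rho[:-1] - rho[1:]  # positive where shallower water is denser
    return float(np.mean(np.maximum(drop, 0.0)))

def physics_guided_loss(pred, obs, obs_mask, lam=1.0):
    """Data misfit on observed depths plus a weighted physics penalty.

    The penalty needs no labels, so it also applies at unobserved depths.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    mask = np.asarray(obs_mask, dtype=bool)
    data_term = float(np.mean((pred[mask] - obs[mask]) ** 2)) if mask.any() else 0.0
    return data_term + lam * physics_penalty(pred)

def fit_residual_correction(x, y_obs, physics_model):
    """Hybrid model: physics prediction plus a linear fit to its residual."""
    x = np.asarray(x, dtype=float)
    residual = np.asarray(y_obs, dtype=float) - physics_model(x)
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, residual, rcond=None)

    def hybrid(x_new):
        x_new = np.asarray(x_new, dtype=float)
        return physics_model(x_new) + coef[0] * x_new + coef[1]

    return hybrid
```

In a real training loop the loss would be written in an autodiff framework so that the penalty contributes gradients, and the linear correction would typically be replaced by a neural network. The structure stays the same: a data term plus a label-free physics term in the first case, and a physics prediction plus a learned residual in the second.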

Implications and Future Directions

The methodologies discussed have implications for the reliability, efficiency, and robustness of models of complex systems. Hybrid and physics-informed models bridge data-driven insight and scientific domain knowledge, which matters most in heavily data-constrained domains such as climate science and fluid dynamics.

The survey highlights opportunities for cross-pollination of techniques across domains. In particular, applying physics-guided architectures developed for one application domain to others remains a largely open opportunity.

The survey lays a foundation for systematically applying ML across traditional scientific fields, encouraging ML practitioners to consider building domain-informed models. Researchers are encouraged to exchange ideas across domains, leveraging methodologies from rapidly advancing AI fields. This work provides a roadmap for integrating empirical and mechanistic modeling approaches, reducing computational demands, and enhancing sample efficiency while retaining physical realism and interpretability.

As the line between data-driven models and mechanistic approaches blurs, future work is likely to focus on integrating the two across application-centric objectives, from inverse modeling to downscaling, and so extend the use of ML in scientific discovery and practical system modeling.