
Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations (1801.06637v1)

Published 20 Jan 2018 in stat.ML, cs.LG, cs.NA, and math.AP

Abstract: A long-standing problem at the interface of artificial intelligence and applied mathematics is to devise an algorithm capable of achieving human level or even superhuman proficiency in transforming observed data into predictive mathematical models of the physical world. In the current era of abundance of data and advanced machine learning capabilities, the natural question arises: How can we automatically uncover the underlying laws of physics from high-dimensional data generated from experiments? In this work, we put forth a deep learning approach for discovering nonlinear partial differential equations from scattered and potentially noisy observations in space and time. Specifically, we approximate the unknown solution as well as the nonlinear dynamics by two deep neural networks. The first network acts as a prior on the unknown solution and essentially enables us to avoid numerical differentiations which are inherently ill-conditioned and unstable. The second network represents the nonlinear dynamics and helps us distill the mechanisms that govern the evolution of a given spatiotemporal data-set. We test the effectiveness of our approach for several benchmark problems spanning a number of scientific domains and demonstrate how the proposed framework can help us accurately learn the underlying dynamics and forecast future states of the system. In particular, we study the Burgers', Korteweg-de Vries (KdV), Kuramoto-Sivashinsky, nonlinear Schrödinger, and Navier-Stokes equations.

Citations (714)

Summary

  • The paper introduces a dual neural network framework that infers underlying nonlinear PDEs from noisy data without relying on unstable numerical differentiation.
  • It validates the approach on benchmark equations such as Burgers', KdV, and Navier-Stokes, achieving relative L² errors as low as 4.78 × 10⁻³.
  • The method eliminates explicit derivative approximations, offering a scalable and robust tool for modeling complex physical systems.

Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations

Maziar Raissi presents a methodology that bridges artificial intelligence and applied mathematics by leveraging deep learning to infer complex nonlinear partial differential equations (PDEs) from high-dimensional data. The paper addresses a fundamental challenge: formulating predictive mathematical models of physical phenomena directly from scattered and noisy data, bypassing the ill-conditioned numerical differentiation that such tasks typically require.

Summary of Methodology

The methodology proposed by Raissi employs two deep neural networks to approximate both the unknown solution and the nonlinear dynamics of the PDE. The first network acts as a prior on the solution, facilitating the avoidance of unstable numerical differentiation. The second network encapsulates the nonlinear dynamics, thus identifying the governing laws of spatiotemporal data. The derivative computation of the networks' outputs is achieved via automatic differentiation, ensuring precision without introducing numerical errors.
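The two-network construction described above can be sketched in a few lines. The following PyTorch snippet is a minimal illustration, not the author's original implementation: the network sizes, the tanh activations, and the assumption of at most second-order spatial derivatives are all illustrative choices.

```python
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, width=50, depth=4):
    """Small fully connected network with tanh activations."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

u_net = mlp(2, 1)  # prior on the solution: u(t, x)
n_net = mlp(3, 1)  # unknown dynamics: N(u, u_x, u_xx), assumed second order

def pde_residual(t, x):
    """f = u_t - N(u, u_x, u_xx), all derivatives via automatic differentiation."""
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    u = u_net(torch.cat([t, x], dim=1))
    grad = lambda y, v: torch.autograd.grad(
        y, v, torch.ones_like(y), create_graph=True)[0]
    u_t = grad(u, t)
    u_x = grad(u, x)
    u_xx = grad(u_x, x)
    return u_t - n_net(torch.cat([u, u_x, u_xx], dim=1))

def loss_fn(t_obs, x_obs, u_obs):
    """Fit the observed data while driving the PDE residual to zero."""
    u_pred = u_net(torch.cat([t_obs, x_obs], dim=1))
    data_loss = ((u_pred - u_obs) ** 2).mean()
    physics_loss = (pde_residual(t_obs, x_obs) ** 2).mean()
    return data_loss + physics_loss
```

Minimizing `loss_fn` over both networks' parameters jointly fits the observations and identifies the hidden dynamics; because every derivative is taken by autodiff through `u_net`, no finite-difference stencil ever touches the noisy data.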

To validate this approach, the author tests the framework on benchmark equations known for their complexity and relevance in scientific and engineering domains: Burgers', Korteweg-de Vries (KdV), Kuramoto-Sivashinsky, nonlinear Schrödinger, and Navier-Stokes equations. The results indicate the algorithm's capability in accurately discovering PDEs and predicting future system states.

Key Results and Implications

Burgers' Equation

For the Burgers' equation, commonly arising in fluid mechanics and other applied fields, the algorithm demonstrated a relative L² error of 4.78 × 10⁻³. Despite training on data collected from a limited portion of the spatiotemporal domain, the method effectively extrapolated to unobserved time intervals.
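Throughout the results, accuracy is reported as the relative L² error between the learned and reference solutions, ‖u_pred − u_true‖₂ / ‖u_true‖₂. As a quick NumPy illustration (the toy data here is not from the paper):

```python
import numpy as np

def relative_l2_error(u_pred, u_true):
    """Relative L2 error: ||u_pred - u_true||_2 / ||u_true||_2."""
    return np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true)

# Toy check: a uniform 0.1% perturbation gives a relative error of 1e-3.
u_true = np.sin(np.linspace(0.0, np.pi, 100))
u_pred = u_true * 1.001
print(relative_l2_error(u_pred, u_true))  # ≈ 1.0e-3
```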

KdV Equation

Applied to the KdV equation, the method achieved a relative L² error of 6.28 × 10⁻², affirming its capability in handling wave propagation phenomena with third-order spatial derivatives.

Kuramoto-Sivashinsky Equation

In the context of spatiotemporal chaotic systems modeled by the Kuramoto-Sivashinsky equation, the framework approximated the underlying dynamics with a relative L² error of 7.63 × 10⁻². The approach demonstrated robustness against system complexity and chaos.

Nonlinear Schrödinger Equation

The nonlinear Schrödinger equation, pertinent to fields such as nonlinear optics and quantum mechanics, posed an additional challenge due to its complex-valued solutions. The model maintained accuracy, producing a relative L² error of 6.28 × 10⁻³.

Navier-Stokes Equation

For the two-dimensional Navier-Stokes equations modeling fluid flow, the approach effectively learned the underlying dynamics from sparse and noisy training data, achieving a relative L² error of 5.79 × 10⁻³.

Practical and Theoretical Implications

The proposed framework provides a versatile and efficient tool for the discovery of PDEs governing complex dynamics across various scientific fields. The main advantages include the ability to handle noise and sparse data, scalability to high-dimensional systems, and the elimination of the need for derivative approximations. This has significant practical implications in fields where collecting large datasets is expensive or infeasible.

On a theoretical front, this work paves the way for advancements in data-driven scientific discovery, enabling the formulation of governing equations for systems previously resistant to traditional modeling approaches. The integration of physics-informed priors and modern machine learning techniques has the potential to reshape methodologies in dynamic system identification and predictive modeling.

Future Directions

Future research could explore several extensions, such as the adaptation of convolutional neural networks for high-dimensional PDEs in dynamic programming and optimal control contexts. Additionally, parameterized PDEs with bifurcation behavior (e.g., dependence on Reynolds number) present an intriguing avenue for extending this framework. Real-world applications may also benefit from investigating time-delay coordinate embeddings, potentially broadening the scope of observable dynamical systems.

Overall, Raissi's work represents a significant methodological advancement in the automatic discovery of physical laws, with broad applications and numerous opportunities for further investigation and refinement.