PDE-Net: Learning PDEs from Data (1710.09668v2)

Published 26 Oct 2017 in math.NA, cs.LG, cs.NE, and stat.ML

Abstract: In this paper, we present an initial attempt to learn evolution PDEs from data. Inspired by the latest development of neural network designs in deep learning, we propose a new feed-forward deep network, called PDE-Net, to fulfill two objectives at the same time: to accurately predict dynamics of complex systems and to uncover the underlying hidden PDE models. The basic idea of the proposed PDE-Net is to learn differential operators by learning convolution kernels (filters), and apply neural networks or other machine learning methods to approximate the unknown nonlinear responses. Compared with existing approaches, which either assume the form of the nonlinear response is known or fix certain finite difference approximations of differential operators, our approach has the most flexibility by learning both differential operators and the nonlinear responses. A special feature of the proposed PDE-Net is that all filters are properly constrained, which enables us to easily identify the governing PDE models while still maintaining the expressive and predictive power of the network. These constraints are carefully designed by fully exploiting the relation between the orders of differential operators and the orders of sum rules of filters (an important concept originating from wavelet theory). We also discuss relations of the PDE-Net with some existing networks in computer vision such as Network-In-Network (NIN) and Residual Neural Network (ResNet). Numerical experiments show that the PDE-Net has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.

Citations (718)

Summary

  • The paper introduces a novel deep network architecture that learns both differential operators and nonlinear responses from observational data.
  • It employs convolution kernels with theoretical constraints to accurately approximate differential operators and enhance long-term prediction accuracy.
  • Experimental results validate the model's ability to identify underlying PDE structures, offering a transparent approach to complex dynamical systems.

Learning PDEs from Data Using PDE-Net

Partial Differential Equations (PDEs) are foundational tools for modeling phenomena across many scientific domains. Traditionally, these equations are derived from fundamental physical principles or empirical observations, as exemplified by the Navier-Stokes equations for fluid dynamics and Maxwell's equations for electromagnetism. However, the governing equations for complex systems in modern applications, such as certain climate models and neuroscience problems, often remain incompletely characterized. With the proliferation of sensor technology and data storage capabilities, there is unprecedented access to vast datasets, creating opportunities for data-driven discovery of the underlying physical laws.

Paper Overview

In this paper, titled "PDE-Net: Learning PDEs from Data," Long, Lu, Ma, and Dong propose a novel feed-forward deep learning architecture termed PDE-Net. This deep network addresses the dual objectives of discovering hidden PDE models from observed data and predicting future dynamical behavior. PDE-Net leverages advances in neural network design to differentiate itself from prior approaches, which often incorporate fixed numerical approximations or assume knowledge of certain parameters within the PDEs.

The PDE-Net distinguishes itself by learning both the differential operators and the nonlinear response simultaneously. This flexibility is kept in check by structural constraints on the convolution filters, derived from the theoretical relation between the orders of filter sum rules and the orders of differential operators. These constraints ensure that each learned filter corresponds to an identifiable differential operator, so the governing PDE can be read off from the trained network.
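The sum-rule idea can be illustrated with a filter's moment matrix: fixing which moments vanish (and which equal one) pins down the derivative a filter approximates. The sketch below is illustrative, not the authors' implementation; `moment_matrix` and the demo filter are assumptions for demonstration.

```python
import numpy as np
from math import factorial

def moment_matrix(q):
    """Moment matrix of a 2-D filter q:
    M[i, j] = (1 / (i! j!)) * sum_{k1, k2} k1^i * k2^j * q[k1, k2],
    with indices centered on the filter. PDE-Net constrains certain
    entries of M (the sum-rule orders) so that a learned filter is
    guaranteed to approximate a chosen differential operator."""
    n0, n1 = q.shape
    k0 = np.arange(n0) - (n0 - 1) // 2   # centered row indices
    k1 = np.arange(n1) - (n1 - 1) // 2   # centered column indices
    M = np.empty((n0, n1))
    for i in range(n0):
        for j in range(n1):
            M[i, j] = (k0**i) @ q @ (k1**j) / (factorial(i) * factorial(j))
    return M

# Standard central-difference filter for du/dx (x = second axis):
q_dx = np.array([[0., 0., 0.],
                 [-0.5, 0., 0.5],
                 [0., 0., 0.]])
M = moment_matrix(q_dx)
# M[0, 0] = 0 and M[0, 1] = 1: this filter approximates d/dx to
# leading order, consistent with the sum-rule characterization.
```

In PDE-Net the filters are learned, but entries of this moment matrix are pinned during training, which is what makes the trained filters interpretable as specific derivatives.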

Technical Contributions

  1. Differential Operator Learning: PDE-Net uses convolution kernels to learn differential operators. The network imposes specific constraints on these kernels that align them with the corresponding differential operators, leveraging the concept of wavelet sum rules.
  2. Nonlinear Response Approximation: The nonlinear response functions within the underlying PDEs are approximated using neural networks. This approach bypasses the need to pre-specify the form of these functions, unlike some existing methods.
  3. Deep Network Implementation: PDE-Net integrates multiple layers (termed Δt-blocks), each designed to predict incremental changes over time. This structure enhances long-term predictive stability and enables robust dynamical behavior forecasting.
  4. Relation to Existing Networks: The architecture of PDE-Net shows parallels with Network-In-Network (NIN) and Residual Neural Networks (ResNet), especially in leveraging multiple layers and incorporating point-wise neural networks within each Δt-block.
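A single Δt-block can be sketched as a forward-Euler-style update: convolution filters stand in for differential operators, and a point-wise response combines the resulting feature maps. The code below is a minimal sketch under those assumptions (fixed filters, a hand-written linear response), not the authors' trainable implementation.

```python
import numpy as np

def conv_periodic(u, f):
    """2-D cross-correlation with periodic boundary via np.roll
    (filters are small, so the explicit loop is fine for a demo)."""
    c0, c1 = f.shape[0] // 2, f.shape[1] // 2
    out = np.zeros_like(u)
    for a in range(f.shape[0]):
        for b in range(f.shape[1]):
            out += f[a, b] * np.roll(np.roll(u, c0 - a, axis=0),
                                     c1 - b, axis=1)
    return out

def delta_t_block(u, filters, response, dt=0.01):
    """One Δt-block, forward-Euler style: each filter plays the role
    of a differential operator D_i, and `response` combines the
    feature maps point-wise (in PDE-Net this is a small learned
    network): u(t + dt) ≈ u(t) + dt * response(D1 u, D2 u, ...)."""
    feats = [conv_periodic(u, f) for f in filters]
    return u + dt * response(feats)

# Demo: one step of the heat equation u_t = 0.5 * Laplacian(u),
# using a fixed 5-point Laplacian filter and a linear "response".
lap = np.array([[0., 1., 0.],
                [1., -4., 1.],
                [0., 1., 0.]])
rng = np.random.default_rng(0)
u0 = rng.standard_normal((32, 32))
u1 = delta_t_block(u0, [lap], lambda fs: 0.5 * fs[0], dt=0.01)
```

Stacking N such blocks with shared parameters yields a prediction over the horizon N·Δt, and training through multiple blocks is what gives the network its long-term stability.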

Experimental Results and Implications

The authors validate PDE-Net through numerical experiments on both linear convection-diffusion equations with variable coefficients and nonlinear diffusion equations with a nonlinear source term. Key findings include:

  • Predictive Performance: The PDE-Net demonstrates superior long-term predictive capabilities, even in the presence of noisy data. The network's stability and accuracy improve with larger filter sizes and deeper network structures.
  • PDE Identification: PDE-Net successfully identifies the underlying PDE models, including the specific form of linear and nonlinear terms, thereby offering a transparent view of the inferred physical laws.
  • Comparison with Existing Methods: The experiments highlight the advantages of learnable, constrained filters over fixed numerical approximations (Frozen-PDE-Net), as well as fully unconstrained approaches (Freed-PDE-Net).

Future Directions

The practical implications of PDE-Net are profound, with potential applications extending to real-world datasets where the physical system may not be fully understood or easily observable. Examples include data assimilation in climate modeling, where hidden variables play a critical role, and dynamic systems in finance, where underlying PDEs govern market behaviors. Additionally, the framework can be adapted to design stable and consistent numerical schemes for specific PDE models.

This paper lays the groundwork for future research in data-driven PDE discovery, providing a versatile tool that bridges concepts from deep learning and applied mathematics. Further investigations might include extending PDE-Net to accommodate higher-dimensional systems, integrating domain-specific knowledge to refine network architectures, and exploring hybrid methods that combine data-driven and traditional physics-based approaches.

In conclusion, PDE-Net represents a significant advancement in the ongoing effort to leverage deep learning for scientific discovery, offering both practical flexibility and theoretical robustness in learning PDEs from observed data.