Investigating and Mitigating Failure Modes in Physics-informed Neural Networks (PINNs) (2209.09988v3)
Abstract: This paper explores the difficulties in solving partial differential equations (PDEs) using physics-informed neural networks (PINNs). PINNs use physics as a regularization term in the objective function. However, a drawback of this approach is the requirement for manual hyperparameter tuning, which makes it impractical in the absence of validation data or prior knowledge of the solution. Our investigations of the loss landscapes and backpropagated gradients in the presence of physics reveal that existing methods produce non-convex loss landscapes that are hard to navigate. Our findings demonstrate that high-order PDEs contaminate backpropagated gradients and hinder convergence. To address these challenges, we introduce a novel method that bypasses the calculation of high-order derivative operators and mitigates the contamination of backpropagated gradients. Consequently, we reduce the dimension of the search space and make learning PDEs with non-smooth solutions feasible. Our method also provides a mechanism to focus on complex regions of the domain. In addition, we present a dual unconstrained formulation based on the Lagrange multiplier method to enforce equality constraints on the model's prediction, with adaptive and independent learning rates inspired by adaptive subgradient methods. We apply our approach to solve various linear and non-linear PDEs.
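The dual formulation mentioned in the abstract can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions, not the paper's algorithm: a small quadratic objective stands in for the PDE loss, three scalars stand in for the network weights, and the per-multiplier step size uses an RMSprop-style running average of squared constraint residuals, which is one plausible reading of "adaptive and independent learning rates inspired by adaptive subgradient methods". All names (`solve_constrained`, `eta`, `beta`, and so on) are hypothetical.

```python
import math


def solve_constrained(a=(2.0, 1.0, 3.0), eta=0.002, beta=0.9, eps=1e-8,
                      outer_steps=5000, inner_steps=20, primal_lr=0.25):
    """Toy dual-ascent loop with one adaptive step size per multiplier.

    Minimizes sum_i (x_i - a_i)^2 subject to two equality constraints.
    This is an illustrative assumption, not the update rule from the paper.
    """
    x = [0.0, 0.0, 0.0]   # primal variables (stand-in for network weights)
    lam = [0.0, 0.0]      # one Lagrange multiplier per equality constraint
    v = [0.0, 0.0]        # running average of squared constraint residuals

    def constraints(x):
        # Two equality constraints with deliberately different scales.
        return [x[0] + x[1] - 1.0, 10.0 * (x[2] - 0.5)]

    def grad_lagrangian(x, lam):
        # Gradient w.r.t. x of  sum_i (x_i - a_i)^2 + lam . c(x)
        return [2.0 * (x[0] - a[0]) + lam[0],
                2.0 * (x[1] - a[1]) + lam[0],
                2.0 * (x[2] - a[2]) + 10.0 * lam[1]]

    for _ in range(outer_steps):
        # Primal phase: gradient descent on the Lagrangian with lam held fixed.
        for _ in range(inner_steps):
            g = grad_lagrangian(x, lam)
            x = [xi - primal_lr * gi for xi, gi in zip(x, g)]

        # Dual phase: gradient ascent on each multiplier, with the step
        # scaled independently by that constraint's own residual history.
        c = constraints(x)
        for i in range(len(lam)):
            v[i] = beta * v[i] + (1.0 - beta) * c[i] ** 2
            lam[i] += eta / (math.sqrt(v[i]) + eps) * c[i]

    return x, lam, constraints(x)


if __name__ == "__main__":
    x, lam, c = solve_constrained()
    print("x =", x)
    print("lambda =", lam)
    print("constraint residuals =", c)
```

For this toy problem the analytic solution is x = (1, 0, 0.5) with multipliers (2, 0.5), and the loop should settle close to those values. Because the dual step is normalized by the residual history, both multipliers move at a similar rate even though the second constraint is scaled ten times larger, which is the point of giving each multiplier its own learning rate. The normalization also leaves a small residual oscillation of the order of `eta`.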