PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks (2402.00326v3)

Published 1 Feb 2024 in cs.LG, cs.NA, and math.NA

Abstract: While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their performance is known to degrade when larger and deeper neural network architectures are employed. Our study identifies that the root of this counter-intuitive behavior lies in the use of multi-layer perceptron (MLP) architectures with unsuitable initialization schemes, which result in poor trainability for the network derivatives, and ultimately lead to an unstable minimization of the PDE residual loss. To address this, we introduce Physics-informed Residual Adaptive Networks (PirateNets), a novel architecture that is designed to facilitate stable and efficient training of deep PINN models. PirateNets leverage a novel adaptive residual connection, which allows the networks to be initialized as shallow networks that progressively deepen during training. We also show that the proposed initialization scheme allows us to encode appropriate inductive biases corresponding to a given PDE system into the network architecture. We provide comprehensive empirical evidence showing that PirateNets are easier to optimize and can gain accuracy from considerably increased depth, ultimately achieving state-of-the-art results across various benchmarks. All code and data accompanying this manuscript will be made publicly available at https://github.com/PredictiveIntelligenceLab/jaxpi.

Summary

  • The paper introduces PirateNets, an adaptive residual network architecture that mitigates the initialization pathologies of deep PINNs.
  • It provides theoretical proofs and extensive empirical results demonstrating improved accuracy on benchmark PDEs such as the Allen-Cahn and Korteweg–de Vries equations.
  • The study suggests practical applications in high-fidelity simulations for fluid dynamics, weather modeling, and material science.

PirateNets: Physics-Informed Deep Learning with Residual Adaptive Networks

The paper "PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks" introduces a novel architecture designed to address several challenges in training physics-informed neural networks (PINNs). PINNs have demonstrated significant potential for solving forward and inverse problems governed by partial differential equations (PDEs); however, as the paper points out, they run into difficulties when scaled to deeper network architectures. The paper dissects these issues and proposes Physics-Informed Residual Adaptive Networks (PirateNets) as a robust solution enabling more stable and efficient training.
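For context, a PINN is typically trained by minimizing a weighted composite loss that penalizes mismatch with initial and boundary data together with the PDE residual evaluated at collocation points. The generic form below uses our own notation; the weighting and sampling schemes vary across implementations:

$$
\mathcal{L}(\theta) \;=\; \lambda_{ic}\,\mathcal{L}_{ic}(\theta) \;+\; \lambda_{bc}\,\mathcal{L}_{bc}(\theta) \;+\; \lambda_{r}\,\mathcal{L}_{r}(\theta),
\qquad
\mathcal{L}_{r}(\theta) \;=\; \frac{1}{N_r}\sum_{i=1}^{N_r} \bigl|\mathcal{R}[u_\theta](x_r^i, t_r^i)\bigr|^2,
$$

where $u_\theta$ is the network approximation of the PDE solution and $\mathcal{R}[\cdot]$ denotes the PDE residual operator.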

Key Contributions and Approach

The paper makes several key contributions:

  1. Identification of Initialization Pathologies: The researchers identify that the degradation in training efficiency and stability of deeper PINNs stems from unsuitable initialization schemes used in multi-layer perceptron (MLP) architectures. This poor initialization leads to instability when minimizing the PDE residual loss.
  2. Introduction of PirateNets: The paper presents PirateNets, which incorporate an adaptive residual connection that allows networks to be initialized as shallow models and to progressively deepen during training (a minimal sketch of this connection follows this list). This design mitigates the initialization pathologies and makes deeper PINNs trainable in practice.
  3. Theoretical and Empirical Validation: The paper provides both theoretical justification and extensive empirical results to support the efficacy of PirateNets. They introduce a new initialization scheme that integrates physical priors directly into the model, demonstrated through comprehensive numerical experiments.
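To make the adaptive residual connection of item 2 concrete, the following minimal JAX sketch shows a gated residual block whose trainable scalar gate alpha is initialized to zero, so every block acts as the identity map at the start of training. This is a simplified illustration under our own parameter names, not the authors' jaxpi implementation, whose blocks include additional dense layers and gating operations.

```python
import jax.numpy as jnp
from jax import random

def init_pirate_block(key, dim):
    # Two dense layers plus a trainable gate `alpha` (parameter names are ours).
    k1, k2 = random.split(key)
    glorot = lambda k, shape: random.normal(k, shape) * jnp.sqrt(2.0 / sum(shape))
    return {
        "W1": glorot(k1, (dim, dim)), "b1": jnp.zeros(dim),
        "W2": glorot(k2, (dim, dim)), "b2": jnp.zeros(dim),
        "alpha": jnp.zeros(()),  # gate starts at 0, so the block is the identity at init
    }

def pirate_block(params, x):
    # Candidate update F(x) produced by a small MLP.
    h = jnp.tanh(x @ params["W1"] + params["b1"])
    f = jnp.tanh(h @ params["W2"] + params["b2"])
    # Adaptive residual connection: alpha interpolates between the identity and F(x),
    # letting the network start shallow and deepen as alpha is learned.
    return params["alpha"] * f + (1.0 - params["alpha"]) * x
```

Stacking several such blocks between a coordinate embedding and a final linear layer yields a network that is effectively linear in the embedding at initialization, which is what makes the data-driven initialization discussed later possible.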

Theoretical Underpinning

The manuscript explores the theoretical underpinnings of PINN behavior, proposing that the trainability issues of deeper models stem from MLP derivative networks that are poorly behaved at initialization. The authors argue that the capacity to minimize PDE residuals hinges on whether the network's derivatives can be accurately represented and optimized. This argument is supported by rigorous proofs, focusing on second-order linear elliptic and parabolic PDEs, for which convergence of the training error is shown to imply convergence of both the solution and its derivatives, contingent on appropriate initialization.
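The notion of a "derivative network" can be illustrated with a toy one-dimensional model problem $u''(x) = f(x)$ (not one of the paper's benchmarks): differentiating the network through automatic differentiation produces new networks whose behavior at initialization determines how stably the residual loss can be minimized. The sketch below, with helper names of our own choosing, shows this in JAX.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # A plain tanh MLP u_theta(x); `params` is a list of (W, b) pairs (notation ours).
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze()

def residual_loss(params, xs, f):
    # Model problem u''(x) = f(x): two applications of jax.grad define the
    # "derivative network" u_xx; if its outputs are poorly scaled at initialization,
    # minimizing this residual loss becomes unstable.
    u_x = jax.grad(lambda x: mlp(params, x))
    u_xx = jax.vmap(jax.grad(u_x))(xs)
    return jnp.mean((u_xx - f(xs)) ** 2)
```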

Experimental Results

The empirical studies demonstrate that PirateNets offer consistent improvements in accuracy, robustness, and scalability across various benchmark PDEs, as measured by the relative $L^2$ error defined after the list below. Specifically:

  • Allen-Cahn Equation: PirateNet achieves a relative $L^2$ error of $2.24 \times 10^{-5}$, significantly outperforming other state-of-the-art PINN architectures.
  • Korteweg–de Vries Equation: PirateNet achieves a relative $L^2$ error of $4.27 \times 10^{-4}$, a considerable improvement over previous approaches.
  • Gray-Scott Reaction-Diffusion System: The predicted solutions for both the $u$ and $v$ components closely match the ground truth, demonstrating the model's ability to handle complex pattern formation.
  • Ginzburg-Landau Equation: PirateNet achieves errors of $1.49 \times 10^{-2}$ and $1.90 \times 10^{-2}$ for the real and imaginary components, respectively, showing superior performance over traditional models.
  • Lid-driven Cavity Flow at High Reynolds Number: With a relative $L^2$ error of $4.21 \times 10^{-2}$, PirateNet demonstrates robustness and accuracy in simulating incompressible flow at a high Reynolds number.
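For reference, the relative $L^2$ errors quoted above follow the standard discrete definition, evaluated on a set of $N$ test points (our notation):

$$
\text{Relative } L^2 \text{ error} \;=\; \frac{\lVert u_\theta - u_{\mathrm{ref}} \rVert_2}{\lVert u_{\mathrm{ref}} \rVert_2}
\;=\; \sqrt{\frac{\sum_{i=1}^{N} \bigl(u_\theta(x_i) - u_{\mathrm{ref}}(x_i)\bigr)^2}{\sum_{i=1}^{N} u_{\mathrm{ref}}(x_i)^2}},
$$

where $u_\theta$ is the PirateNet prediction and $u_{\mathrm{ref}}$ the reference solution.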

Practical Implications and Future Directions

The implications of this research are multi-faceted. Practically, PirateNets can be applied in scenarios requiring high-fidelity simulations governed by PDEs, such as fluid dynamics, weather modeling, and material science. The adaptive nature of the network makes it particularly suited for problems where the scale and complexity necessitate deep and expressive models.

Theoretically, this paper lays the groundwork for further exploration into network initialization and architecture design in the context of physics-informed learning. The incorporation of physical priors at the initialization phase offers a novel method for enhancing the model's inductive biases, leading to more robust training dynamics.
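One concrete way to realize such priors, in the spirit of the paper's physics-informed initialization, is sketched below: because the identity-initialized residual blocks leave the network linear in its final-layer weights at the start of training, those weights can be fit by least squares to whatever data is available (an initial condition, sensor measurements, or a coarse solver output). The function and argument names are our own, and this is a simplified sketch rather than the jaxpi implementation.

```python
import jax
import jax.numpy as jnp

def physics_informed_init(embed_fn, x_data, u_data, reg=1e-6):
    # At initialization the residual blocks act as identity maps, so the network
    # output is a linear map of the coordinate embedding Phi(x). Fit the final
    # linear layer to the available data with ridge-regularized least squares.
    Phi = jax.vmap(embed_fn)(x_data)               # (N, d) feature matrix
    A = Phi.T @ Phi + reg * jnp.eye(Phi.shape[1])  # regularized normal equations
    return jnp.linalg.solve(A, Phi.T @ u_data)     # (d,) output-layer weights
```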

Future Work

Looking forward, promising directions include optimizing coordinate embeddings tailored to specific PDE systems to further improve the efficiency and accuracy of PINNs, and extending the principles of physics-informed initialization to domain-specific neural operators for solving parametric PDEs. These advances would not only refine the models themselves but also enhance the practical deployment and reliability of physics-informed machine learning in scientific and engineering applications.

Overall, PirateNets represent a significant advancement in the field of physics-informed machine learning, providing a robust and scalable framework to tackle complex PDE-driven problems with improved stability and accuracy.
