Speeding up and reducing memory usage for scientific machine learning via mixed precision (2401.16645v1)

Published 30 Jan 2024 in cs.LG

Abstract: Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets) stand out as the leading techniques for solving partial differential equations by incorporating both physical equations and experimental data. However, training PINNs and DeepONets requires significant computational resources, including long computation times and large amounts of memory. In search of computational efficiency, training neural networks using half precision (float16) rather than the conventional single (float32) or double (float64) precision has gained substantial interest, given the inherent benefits of reduced computation time and memory consumption. We find, however, that float16 cannot be applied to SciML methods, because of gradient divergence at the start of training, weight updates going to zero, and an inability to converge to a local minimum. To overcome these limitations, we explore mixed precision, an approach that combines the float16 and float32 numerical formats to reduce memory usage and increase computational speed. Our experiments show that mixed precision training not only substantially decreases training times and memory demands but also maintains model accuracy. We also reinforce our empirical observations with a theoretical analysis. The research has broad implications for SciML in various computational applications.
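
To make the mechanics concrete, below is a minimal sketch of mixed precision training in the style the paper studies, using PyTorch's torch.amp machinery on a toy PINN-style loss. It assumes a CUDA device; the network, the toy ODE u'(x) = cos(2*pi*x), and all hyperparameters are illustrative choices, not the paper's experimental setup. The float32 master weights and loss scaling are what counter the float16 failure modes the abstract identifies (weight updates underflowing to zero and early-training gradient divergence).

import math
import torch
import torch.nn as nn

# Minimal sketch, not the paper's code: mixed precision training of a
# small PINN for the toy ODE u'(x) = cos(2*pi*x) on [0, 1]. Assumes CUDA.
device = "cuda"

model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
).to(device)  # weights stay in float32 (the "master" copy)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# GradScaler multiplies the loss by a large factor before backward so that
# small float16 gradients do not underflow, then unscales before the update.
scaler = torch.cuda.amp.GradScaler()

x = torch.linspace(0.0, 1.0, 256, device=device).reshape(-1, 1)
x.requires_grad_(True)  # needed for the autograd-based PDE derivative

for step in range(2000):
    optimizer.zero_grad(set_to_none=True)
    # autocast runs matmul-heavy ops in float16 while keeping
    # precision-sensitive ops (e.g. reductions) in float32.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        u = model(x)
        (du_dx,) = torch.autograd.grad(
            u, x, grad_outputs=torch.ones_like(u), create_graph=True
        )
        residual = du_dx - torch.cos(2 * math.pi * x)
        loss = residual.pow(2).mean()  # PDE residual loss
    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscale gradients, then Adam step
    scaler.update()                # adapt the scale factor over time

Casting the model itself to float16 and dropping the scaler would correspond to the pure float16 setting the paper reports as unusable for SciML training.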
