Training Deep Surrogate Models with Large Scale Online Learning (2306.16133v1)
Abstract: The spatiotemporal resolution of Partial Differential Equations (PDEs) plays an important role in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically with computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtaining fast PDE solutions. Models are usually trained on synthetic data generated by solvers, stored on disk, and read back for training. This paper argues that relying on a traditional static dataset to train these models does not exploit the full potential of the solver as a data generator. It proposes an open-source online training framework for deep surrogate models. The framework implements several levels of parallelism focused on simultaneously generating numerical simulations and training deep neural networks. This approach eliminates the I/O and storage bottleneck associated with disk-loaded datasets and opens the way to training on significantly larger datasets. Experiments compare the offline and online training of four surrogate models, including state-of-the-art architectures. Results indicate that exposing deep surrogate models to more dataset diversity, up to hundreds of GB, can increase model generalization capabilities. The prediction accuracy of fully connected neural networks, the Fourier Neural Operator (FNO), and the Message Passing PDE Solver improves by 68%, 16%, and 7%, respectively.
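The core idea of the framework, streaming solver output directly into the trainer instead of round-tripping through disk, can be sketched as a producer-consumer loop. The toy solver, buffer size, and "gradient step" below are illustrative assumptions for a minimal single-node sketch, not the paper's actual implementation (which runs many parallel solver instances and distributed training):

```python
import queue
import random
import threading

def solver(sim_id, steps=50):
    """Toy 'solver': yields (state, next_state) pairs from a random initial condition."""
    x = random.random()
    for _ in range(steps):
        x_next = x + 0.1 * (1.0 - x)  # stand-in for one PDE time step
        yield (x, x_next)
        x = x_next

def run_solvers(buf, n_sims=4):
    """Producer: runs simulations and streams every time step into the buffer."""
    for s in range(n_sims):
        for sample in solver(s):
            buf.put(sample)
    buf.put(None)  # sentinel: data generation finished

def train_online(buf, batch_size=8):
    """Consumer: each sample is seen as it arrives; nothing is written to disk."""
    batch, n_updates = [], 0
    while True:
        item = buf.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            # a real gradient step on `batch` would go here
            n_updates += 1
            batch.clear()
    return n_updates

buf = queue.Queue(maxsize=256)  # in-memory buffer replaces the on-disk dataset
producer = threading.Thread(target=run_solvers, args=(buf,))
producer.start()
updates = train_online(buf)
producer.join()
print(updates)  # 4 sims x 50 steps / batch size 8 = 25 updates
```

In the actual framework, the producer side would be MPI-parallel solver instances sending fields over the network, and the bounded buffer (plus a sampling policy such as reservoir sampling) controls memory use and mitigates the temporal bias of streamed trajectories.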
- Lucas Meyer
- Marc Schouler
- Robert Alexander Caulk
- Alejandro Ribés
- Bruno Raffin