Differentiability in Unrolled Training of Neural Physics Simulators on Transient Dynamics (2402.12971v2)

Published 20 Feb 2024 in physics.comp-ph and cs.LG

Abstract: Unrolling training trajectories over time strongly influences the inference accuracy of neural network-augmented physics simulators. We analyze this in three variants of training neural time-steppers. In addition to one-step setups and fully differentiable unrolling, we include a third, less widely used variant: unrolling without temporal gradients. Comparing networks trained with these three modalities disentangles the two dominant effects of unrolling, training distribution shift and long-term gradients. We present a detailed study across physical systems, network sizes and architectures, training setups, and test scenarios. It also encompasses two simulation modes: In prediction setups, we rely solely on neural networks to compute a trajectory. In contrast, correction setups include a numerical solver that is supported by a neural network. Spanning these variations, our study provides the empirical basis for our main findings: Non-differentiable but unrolled training with a numerical solver in a correction setup can yield substantial improvements over a fully differentiable prediction setup not utilizing this solver. The accuracy of models trained in a fully differentiable setup differs from that of their non-differentiable counterparts. Differentiable ones perform best in a comparison among correction networks as well as among prediction setups. For both, the accuracy of non-differentiable unrolling comes close. Furthermore, we show that these behaviors are invariant to the physical system, the network architecture and size, and the numerical scheme. These results motivate integrating non-differentiable numerical simulators into training setups even if full differentiability is unavailable. We show the convergence rate of common architectures to be low compared to numerical algorithms. This motivates correction setups combining neural and numerical parts which utilize the benefits of both.


Summary

  • The paper demonstrates that temporal unrolling significantly boosts simulator accuracy: fully differentiable unrolling yields a 38% average improvement over one-step training, and non-differentiable unrolling in a correction setup yields an average 4.5-fold gain over fully differentiable prediction setups.
  • It empirically evaluates multiple physical systems and reveals that training with a curriculum and incremental unrolling stabilizes long-term gradient computations.
  • The research underscores that integrating neural networks with legacy numerical solvers via non-differentiable unrolling offers a scalable, cost-effective path for high-fidelity hybrid simulations.

Overview of Temporal Unrolling in Neural Physics Simulators

The paper by Bjoern List and colleagues examines the influence of temporal unrolling on the training of neural network-augmented physics simulators and its effect on inference accuracy. The research identifies and disentangles the two main influences of temporal unrolling: ameliorating training distribution shift and enabling long-term gradient computations. The authors compare three distinct training modalities: the commonly used one-step training, fully differentiable unrolling (referred to as WIG, "with gradients"), and non-differentiable unrolling without temporal gradients (NOG, "no gradients").
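
To make the three modalities concrete, the sketch below contrasts them in PyTorch-style code. It is a minimal illustration, not the authors' implementation: the names `rollout_loss`, `net`, and `solver` are assumptions, and the correction setup is simplified to a residual added to a coarse solver step. The key distinction is where gradients are cut: NOG detaches the state at each step, so the solver can remain a non-differentiable black box, while WIG backpropagates through the entire rollout.

```python
import torch

def rollout_loss(net, solver, x0, targets, m, mode='wig', setup='correct'):
    """Training loss over an m-step rollout (illustrative sketch).

    setup: 'predict' - the network alone advances the state;
           'correct' - a coarse numerical solver step plus a learned correction.
    mode:  'one' - one-step training (horizon forced to 1);
           'nog' - unrolled without temporal gradients: the state is detached
                   each step, so `solver` may be a non-differentiable black box;
           'wig' - fully differentiable unrolling (backprop through the rollout).
    """
    horizon = 1 if mode == 'one' else m
    loss, x = 0.0, x0
    for t in range(horizon):
        if mode != 'wig':
            x = x.detach()                  # cut the gradient path through time
        if setup == 'correct':
            x = solver(x)                   # numerical time step
            x = x + net(x)                  # neural correction of the coarse state
        else:
            x = net(x)                      # purely neural prediction
        loss = loss + torch.nn.functional.mse_loss(x, targets[t])
    return loss / horizon
```

Note that in 'wig' mode with a correction setup, `solver` must itself be differentiable (e.g., implemented in PyTorch), whereas 'nog' only ever differentiates a single application of `net` per step.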

Key Contributions

  1. Empirical Evaluation Across Systems and Architectures: Through rigorous empirical evaluation across various physical systems (including Kuramoto-Sivashinsky, wake flow, Kolmogorov flow, and compressible aerofoil flow), the authors provide a comprehensive perspective on how temporal unrolling influences the performance of NN-augmented simulators. They demonstrate that non-differentiable unrolling can offer significant improvements over fully differentiable prediction setups.
  2. Implications for Neural Hybrid Simulators: The paper highlights that even when numerical solvers are not differentiable, interfacing these solvers with neural architectures using NOG techniques can lead to substantial accuracy improvements. This approach does not necessitate differentiable implementations, making it particularly applicable to environments with legacy numerical codes.
  3. Scalability and Convergence: The paper identifies a low convergence rate for neural architectures compared to traditional numerical algorithms, suggesting that a hybrid approach can exploit the scaling benefits of numerical solvers while incorporating the adaptability of neural networks. This insight is pivotal for large-scale scientific applications where cost-effective scaling is crucial.
  4. Use of Curriculums for Training Stability: The research stresses the necessity of employing a curriculum when training with unrolled setups. Incrementally increasing the number of unrolled steps while adjusting the learning rate is the recommended practice, ensuring stable training and reliable gradient flows; a minimal sketch follows this list.
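
Below is a minimal sketch of such a curriculum, assuming the `rollout_loss` helper from the earlier snippet; the stage horizons and learning rates are placeholders, not values from the paper.

```python
import torch

def train_with_curriculum(net, solver, loader,
                          stages=((2, 1e-4), (4, 5e-5), (8, 2e-5)),
                          epochs_per_stage=5, mode='nog'):
    # Each stage lengthens the unrolling horizon m and lowers the learning
    # rate, which keeps optimization stable as the rollouts grow longer.
    for m, lr in stages:
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(epochs_per_stage):
            for x0, targets in loader:      # targets: at least m reference states
                opt.zero_grad()
                loss = rollout_loss(net, solver, x0, targets, m, mode=mode)
                loss.backward()
                opt.step()
```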

Numerical Results

The results show a consistent improvement with unrolled training setups across different systems and architectures:

  • A non-differentiable but unrolled setup shows, on average, a 4.5-fold improvement over fully differentiable prediction setups, demonstrating the significant effect of reducing training distribution shift.
  • Fully differentiable unrolling (WIG) consistently outperforms other methods in terms of inference accuracy, showing the value of long-term gradients—resulting in a 38% average improvement over one-step methods.
  • Prediction setups, however, benefit less from differentiable unrolling; with a broader range of valid converged solutions, non-differentiable (NOG) unrolling comes close to fully differentiable training, particularly for smaller architectures.

Theoretical and Practical Implications

The paper's implications are extensive, both practically and theoretically. Practically, the findings encourage the intersection of machine learning and existing numerical simulation infrastructures, suggesting that unrolled training not only enhances performance but also makes it possible to leverage legacy numerical solvers. Theoretically, it advances our understanding of neural network training dynamics in temporal sequence modeling, particularly under chaotic or turbulent conditions like those found in fluid dynamics.

Future Directions

Future research could explore the deployment of tailored architectures that could potentially enhance neural network scaling efficiencies. Additionally, investigating applications beyond fluid dynamics and varying domains could test the generality of the findings. Moreover, exploring the full potential of neural and numerical hybrid models may reveal new paradigms in data-driven physics simulations.

By providing a detailed exploration and a robust empirical study, the paper contributes significantly to the current understanding of neural networks in physics-based modeling, offering both data analysis and methodological recommendations that could influence future developments in the field.
