Scaling physics-informed hard constraints with mixture-of-experts (2402.13412v1)

Published 20 Feb 2024 in cs.LG, cs.AI, cs.NA, math.NA, and math.OC

Abstract: Imposing known physical constraints, such as conservation laws, during neural network training introduces an inductive bias that can improve accuracy, reliability, convergence, and data efficiency for modeling physical dynamics. While such constraints can be softly imposed via loss function penalties, recent advancements in differentiable physics and optimization improve performance by incorporating PDE-constrained optimization as individual layers in neural networks. This enables a stricter adherence to physical constraints. However, imposing hard constraints significantly increases computational and memory costs, especially for complex dynamical systems. This is because it requires solving an optimization problem over a large number of points in a mesh, representing spatial and temporal discretizations, which greatly increases the complexity of the constraint. To address this challenge, we develop a scalable approach to enforce hard physical constraints using Mixture-of-Experts (MoE), which can be used with any neural network architecture. Our approach imposes the constraint over smaller decomposed domains, each of which is solved by an "expert" through differentiable optimization. During training, each expert independently performs a localized backpropagation step by leveraging the implicit function theorem; the independence of each expert allows for parallelization across multiple GPUs. Compared to standard differentiable optimization, our scalable approach achieves greater accuracy in the neural PDE solver setting for predicting the dynamics of challenging non-linear systems. We also improve training stability and require significantly less computation time during both training and inference stages.
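
To make the abstract's mechanism concrete, below is a minimal sketch (not the authors' implementation) of a hard-constraint layer with an implicit-function-theorem backward pass, plus a per-subdomain "expert" decomposition. The paper enforces nonlinear PDE constraints with an iterative differentiable solver; as a simplifying assumption, this sketch substitutes a hypothetical linear equality constraint A u = b per subdomain, which has a closed-form projection. All names, shapes, and the linear-constraint choice are illustrative assumptions; only the pattern (custom VJP from the implicit function theorem, independent solves vmapped over decomposed subdomains) reflects the idea in the abstract.

import jax
import jax.numpy as jnp

def project(u_hat, A, b):
    # Minimum-norm correction of u_hat onto the affine set {u : A u = b},
    # i.e. the solution of  min_u ||u - u_hat||^2  s.t.  A u = b.
    lam = jnp.linalg.solve(A @ A.T, A @ u_hat - b)
    return u_hat - A.T @ lam

@jax.custom_vjp
def constraint_layer(u_hat, A, b):
    return project(u_hat, A, b)

def constraint_layer_fwd(u_hat, A, b):
    return project(u_hat, A, b), (A, b)

def constraint_layer_bwd(res, v):
    # Implicit-function-theorem gradient: for this quadratic problem
    # d u*/d u_hat = I - A^T (A A^T)^{-1} A, so the VJP re-applies the
    # projection to the incoming cotangent instead of unrolling a solver.
    A, b = res
    v_proj = v - A.T @ jnp.linalg.solve(A @ A.T, A @ v)
    # No gradients w.r.t. A and b are needed in this sketch.
    return (v_proj, jnp.zeros_like(A), jnp.zeros_like(b))

constraint_layer.defvjp(constraint_layer_fwd, constraint_layer_bwd)

# Each "expert" enforces the constraint on its own subdomain, so the solves are
# independent and can be vmapped (or sharded across devices) rather than posed
# as one large global optimization problem over the full mesh.
per_expert_layer = jax.vmap(constraint_layer, in_axes=(0, 0, 0))

# Toy shapes: 4 subdomains, 16 mesh points each, 3 linear constraints per subdomain.
u_hat = jax.random.normal(jax.random.PRNGKey(0), (4, 16))   # network predictions
A = jax.random.normal(jax.random.PRNGKey(1), (4, 3, 16))
b = jnp.zeros((4, 3))

loss = lambda u: jnp.sum(per_expert_layer(u, A, b) ** 2)
grads = jax.grad(loss)(u_hat)   # gradients flow through the IFT-based backward pass

In the paper's nonlinear setting, the forward pass would run an iterative constrained solve per subdomain, but the backward pass would still reduce to a linear system supplied by the implicit function theorem, which is what keeps each expert's localized backpropagation cheap and independent.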

Authors (3)
  1. Nithin Chalapathi
  2. Yiheng Du
  3. Aditi Krishnapriyan
Citations (8)
