
Hyena Neural Operator for Partial Differential Equations (2306.16524v2)

Published 28 Jun 2023 in cs.LG, cs.AI, cs.NA, and math.NA

Abstract: Numerically solving partial differential equations typically requires fine discretization to resolve the necessary spatiotemporal scales, which can be computationally expensive. Recent advances in deep learning offer an alternative based on neural operators: neural network architectures that learn mappings between function spaces and can solve partial differential equations directly from data. This study employs a novel neural operator, Hyena, which uses a long convolutional filter parameterized by a multilayer perceptron. The Hyena operator achieves sub-quadratic complexity and uses a state-space-model parameterization of the long convolution to obtain a global receptive field. This mechanism improves the model's grasp of the input's context and yields data-dependent weights for different partial differential equation instances. To measure how effective these layers are at solving partial differential equations, we conduct experiments on the Diffusion-Reaction equation and the Navier-Stokes equation. Our findings indicate that the Hyena neural operator can serve as an efficient and accurate model for learning the solution operator of partial differential equations. The data and code used can be found at: https://github.com/Saupatil07/Hyena-Neural-Operator
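The two ideas the abstract highlights, a long filter produced implicitly by a small MLP over positions and an FFT-based convolution that gives a global receptive field at sub-quadratic cost, can be sketched as follows. This is a minimal illustration in NumPy, not the paper's exact architecture: the layer sizes, the sinusoidal activation, and the function names (`mlp_filter`, `fft_long_conv`) are assumptions made for the example.

```python
import numpy as np

def mlp_filter(L, hidden=16, seed=0):
    """Parameterize a length-L filter implicitly: position -> small MLP -> tap.

    A sinusoidal activation is used here as a stand-in for an implicit
    neural representation; the actual choice in the paper may differ.
    """
    r = np.random.default_rng(seed)
    W1 = r.normal(size=(1, hidden))
    b1 = r.normal(size=hidden)
    W2 = r.normal(scale=1.0 / hidden, size=(hidden, 1))
    t = np.linspace(0.0, 1.0, L)[:, None]   # normalized grid positions
    h = np.sin(t @ W1 + b1)                 # hidden features per position
    return (h @ W2).ravel()                 # one filter tap per position

def fft_long_conv(u, k):
    """Causal long convolution via FFT: O(L log L) instead of O(L^2)."""
    L = len(u)
    n = 2 * L                               # zero-pad to avoid circular wraparound
    U = np.fft.rfft(u, n=n)
    K = np.fft.rfft(k, n=n)
    return np.fft.irfft(U * K, n=n)[:L]     # keep the causal part

rng = np.random.default_rng(1)
L = 256
u = rng.normal(size=L)      # samples of an input function on a 1-D grid
k = mlp_filter(L)           # MLP-parameterized long filter (global support)
y = fft_long_conv(u, k)

# Sanity check: the FFT path matches direct (quadratic-cost) convolution.
assert np.allclose(y, np.convolve(u, k)[:L], atol=1e-8)
```

Because the filter is generated by an MLP rather than stored as explicit weights, its length can match the full input resolution without the parameter count growing with `L`, which is what allows the global receptive field at sub-quadratic cost.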
