Hyena Neural Operator for Partial Differential Equations (2306.16524v2)
Abstract: Numerically solving partial differential equations typically requires fine discretization to resolve the necessary spatiotemporal scales, which can be computationally expensive. Recent advances in deep learning have provided a new approach to solving partial differential equations that involves the use of neural operators: neural network architectures that learn mappings between function spaces and can solve partial differential equations from data. This study utilizes a novel neural operator called Hyena, which employs a long convolutional filter parameterized by a multilayer perceptron. The Hyena operator enjoys sub-quadratic complexity and uses a state space model to parameterize a long convolution with a global receptive field. This mechanism enhances the model's comprehension of the input's context and enables data-dependent weights for different partial differential equation instances. To measure how effective these layers are at solving partial differential equations, we conduct experiments on the Diffusion-Reaction and Navier-Stokes equations. Our findings indicate that the Hyena neural operator can serve as an efficient and accurate model for learning the solution operators of partial differential equations. The data and code used can be found at: https://github.com/Saupatil07/Hyena-Neural-Operator
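The core mechanism the abstract describes — a long convolutional filter generated implicitly by a small MLP from positional features, applied in sub-quadratic time via the FFT — can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: the layer sizes, positional features, and random weights are illustrative assumptions, and the trained gating/state-space components of the full Hyena operator are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 256  # sequence length (e.g. number of grid points of a discretized PDE state)

# Positional features for each grid point t in [0, 1): position plus one
# sinusoidal pair (illustrative choice, not the paper's exact featurization).
t = np.linspace(0, 1, L, endpoint=False)[:, None]                                  # (L, 1)
feats = np.concatenate([t, np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)  # (L, 3)

# Tiny MLP (3 -> 16 -> 1) with random weights, mapping position -> filter value.
# This is the "implicit parameterization": the filter is as long as the input,
# but the parameter count is fixed by the MLP, not by L.
W1, b1 = rng.standard_normal((3, 16)) * 0.5, np.zeros(16)
W2, b2 = rng.standard_normal((16, 1)) * 0.5, np.zeros(1)
h = (np.tanh(feats @ W1 + b1) @ W2 + b2).ravel()  # long filter, shape (L,)

# Input function sampled on the grid (stand-in for a PDE state).
u = rng.standard_normal(L)

# Circular convolution via FFT: O(L log L) instead of the direct O(L^2),
# and every output position sees every input position (global receptive field).
y_fft = np.fft.irfft(np.fft.rfft(u) * np.fft.rfft(h), n=L)

# Sanity check against the direct O(L^2) circular convolution.
y_direct = np.array([sum(u[(n - k) % L] * h[k] for k in range(L)) for n in range(L)])
print(np.allclose(y_fft, y_direct))  # True
```

The FFT route is what makes the operator sub-quadratic in the sequence length, while the MLP parameterization keeps the filter global without a per-position weight for every lag.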