Multi-Resolution Active Learning of Fourier Neural Operators (2309.16971v4)
Abstract: The Fourier Neural Operator (FNO) is a popular operator learning framework. It not only achieves state-of-the-art performance in many tasks but is also efficient in training and prediction. However, collecting training data for the FNO can be a costly bottleneck in practice, because it often demands expensive physical simulations. To overcome this problem, we propose Multi-Resolution Active learning of FNO (MRA-FNO), which dynamically selects the input functions and resolutions to lower the data cost as much as possible while optimizing the learning efficiency. Specifically, we propose a probabilistic multi-resolution FNO and use ensemble Monte Carlo to develop an effective posterior inference algorithm. To conduct active learning, we maximize a utility-cost ratio as the acquisition function to acquire new examples and resolutions at each step. We use moment matching and the matrix determinant lemma to enable tractable, efficient utility computation. Furthermore, we develop a cost annealing framework to avoid over-penalizing high-resolution queries at the early stage. This over-penalization is severe when the cost difference between resolutions is large, and it often leaves active learning stuck at low-resolution queries with inferior performance. Our method overcomes this problem and applies to general multi-fidelity active learning and optimization problems. We demonstrate the advantage of our method on several benchmark operator learning tasks. The code is available at https://github.com/shib0li/MRA-FNO.
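The abstract describes the acquisition step only at a high level, so the following is a minimal, hypothetical sketch of what a utility-cost-ratio acquisition with cost annealing could look like; the function name `annealed_acquisition`, the linear annealing schedule, and the array shapes are assumptions for illustration, not the paper's actual implementation. The matrix determinant lemma mentioned for tractable utility computation is the standard identity $\det(A + u v^\top) = (1 + v^\top A^{-1} u)\,\det(A)$, which allows a rank-one update to a covariance determinant to be evaluated without refactorizing $A$.

```python
import numpy as np

def annealed_acquisition(utilities, costs, step, anneal_steps=50):
    """Hypothetical utility-cost-ratio acquisition with cost annealing.

    utilities    : (num_candidates, num_resolutions) array of estimated
                   utility (e.g., information gain) per candidate/resolution pair.
    costs        : (num_resolutions,) array of per-query simulation cost.
    step         : current active-learning iteration.
    anneal_steps : number of iterations over which the cost penalty is ramped in.
    """
    # Ramp the cost exponent from 0 (cost ignored) to 1 (full utility/cost ratio)
    # so that early queries are not over-penalized for choosing high resolutions.
    gamma = min(1.0, step / anneal_steps)
    scores = utilities / (costs[None, :] ** gamma)
    # Return the indices of the highest-scoring candidate/resolution pair.
    return np.unravel_index(np.argmax(scores), scores.shape)
```

Under this kind of schedule, the earliest queries are driven almost entirely by utility, and the cost penalty only reaches full strength once enough data has been acquired to rank resolutions reliably, which is the failure mode the cost annealing framework is meant to avoid.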