Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks (2404.03099v1)
Abstract: Operator learning is a growing field of scientific computing in which the inputs or outputs of a machine learning model are functions defined on infinite-dimensional spaces. In this paper, we introduce NEON (Neural Epistemic Operator Networks), an architecture for generating predictions with uncertainty using a single operator network backbone, which has orders of magnitude fewer trainable parameters than deep ensembles of comparable performance. We showcase the utility of this method for sequential decision-making on the problem of composite Bayesian optimization (BO), in which we aim to optimize a function $f = g \circ h$, where $h: X \to C(\mathcal{Y}, \mathbb{R}^{d_s})$ is an unknown map whose outputs are elements of a function space, and $g: C(\mathcal{Y}, \mathbb{R}^{d_s}) \to \mathbb{R}$ is a known and cheap-to-compute functional. By comparing our approach to other state-of-the-art methods on toy and real-world scenarios, we demonstrate that NEON achieves state-of-the-art performance while requiring orders of magnitude fewer trainable parameters.
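To make the composite setup concrete, below is a minimal sketch, not the paper's NEON implementation, of a Monte Carlo expected-improvement acquisition for $f = g \circ h$: posterior draws of the unknown map $h$ come from some surrogate, and the known functional $g$ is applied to each draw before scoring candidates. The sampler `sample_h_posterior` and the toy placeholders `toy_sampler` and `toy_g` are hypothetical interfaces assumed here for illustration.

```python
import numpy as np

def composite_ei(x_candidates, sample_h_posterior, g, best_f, n_samples=64):
    """Monte Carlo expected improvement for a composite objective f = g(h(x)).

    x_candidates:       (n, d_x) array of candidate inputs.
    sample_h_posterior: hypothetical surrogate sampler; returns posterior draws of
                        h(x) on a fixed grid of m points in Y, shape (n_samples, n, m).
    g:                  known, cheap functional mapping an (m,)-vector
                        (a discretized function) to a scalar.
    best_f:             best composite value observed so far.
    """
    h_draws = sample_h_posterior(x_candidates, n_samples)   # (S, n, m)
    f_draws = np.apply_along_axis(g, -1, h_draws)           # (S, n): g applied to each draw
    improvement = np.maximum(f_draws - best_f, 0.0)         # (S, n)
    return improvement.mean(axis=0)                         # (n,): MC estimate of EI

# Toy usage with placeholder surrogate and functional (purely illustrative):
rng = np.random.default_rng(0)

def toy_sampler(x, s):
    # Fake posterior draws of h(x) on a 50-point grid of Y.
    return rng.normal(size=(s, x.shape[0], 50))

def toy_g(u):
    # Known functional, e.g. a (negated) discretized L2 norm.
    return -np.sqrt(np.mean(u ** 2))

ei = composite_ei(np.zeros((8, 3)), toy_sampler, toy_g, best_f=-1.0)
next_x_index = int(np.argmax(ei))
```

The key point of the composite formulation is that uncertainty is propagated through the known functional $g$ draw by draw, rather than by modeling the scalar $f$ directly.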