Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks (2404.03099v1)

Published 3 Apr 2024 in cs.LG, cs.AI, cs.CE, cs.IT, math.IT, and stat.ML

Abstract: Operator learning is a rising field of scientific computing in which the inputs or outputs of a machine learning model are functions defined on infinite-dimensional spaces. In this paper, we introduce NEON (Neural Epistemic Operator Networks), an architecture for generating predictions with uncertainty using a single operator-network backbone, which has orders of magnitude fewer trainable parameters than deep ensembles of comparable performance. We showcase the utility of this method for sequential decision-making by examining the problem of composite Bayesian Optimization (BO), where we aim to optimize a function $f = g \circ h$, where $h: X \to C(\mathcal{Y}, \mathbb{R}^{d_s})$ is an unknown map that outputs elements of a function space, and $g: C(\mathcal{Y}, \mathbb{R}^{d_s}) \to \mathbb{R}$ is a known and cheap-to-compute functional. By comparing our approach to other state-of-the-art methods on toy and real-world scenarios, we demonstrate that NEON achieves state-of-the-art performance while requiring orders of magnitude fewer trainable parameters.
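To make the composite BO setup concrete, the sketch below unpacks $f = g \circ h$ as a toy optimization loop. Everything here is hypothetical (the names `ToySurrogate`, `Y_GRID`, the damped-sine `h`, the discretized L2 functional `g`) and merely stands in for the paper's NEON surrogate, which is not reproduced. The point is only the structure: an epistemic surrogate models the expensive function-valued map $h$, posterior samples are pushed through the known, cheap functional $g$, and a Monte Carlo expected-improvement score selects the next query.

```python
# Minimal sketch of composite Bayesian Optimization, assuming hypothetical
# components throughout. This is NOT the authors' NEON implementation; the
# ToySurrogate below is a placeholder for an epistemic operator network that
# returns samples of the function-valued output h(x) on a fixed query grid.
import numpy as np

rng = np.random.default_rng(0)

# Discretization grid for the function-valued output h(x) in C(Y, R^{d_s}),
# with d_s = 1 for simplicity.
Y_GRID = np.linspace(0.0, 1.0, 64)
DY = Y_GRID[1] - Y_GRID[0]

def h(x):
    """Unknown, expensive map X -> C(Y, R); here a toy damped sine."""
    return np.sin(4.0 * np.pi * Y_GRID * x) * np.exp(-Y_GRID / (x + 0.1))

def g(u):
    """Known, cheap functional C(Y, R) -> R: a discretized L2 norm,
    negated so the BO loop maximizes f = g o h."""
    return -np.sum(u ** 2) * DY

class ToySurrogate:
    """Stand-in for an epistemic operator network: nearest-neighbor mean
    with distance-based epistemic spread (purely illustrative)."""
    def __init__(self, xs, us):
        self.xs, self.us = np.asarray(xs), np.asarray(us)

    def sample(self, x, n_samples=32):
        d = np.abs(self.xs - x)
        mean = self.us[np.argmin(d)]
        spread = d.min()  # more uncertainty far from observed inputs
        return mean + spread * rng.standard_normal((n_samples, Y_GRID.size))

def acquisition(model, x, best_f):
    """Monte Carlo expected improvement on f = g o h: push posterior
    samples of h(x) through the known functional g."""
    samples = model.sample(x)
    values = np.apply_along_axis(g, 1, samples)
    return np.mean(np.maximum(values - best_f, 0.0))

# Basic BO loop: fit the surrogate on observed (x, h(x)) pairs, score random
# candidates through the acquisition, then evaluate h at the best candidate.
xs = list(rng.uniform(0.0, 1.0, size=3))
us = [h(x) for x in xs]
for step in range(10):
    model = ToySurrogate(xs, us)
    best_f = max(g(u) for u in us)
    candidates = rng.uniform(0.0, 1.0, size=256)
    scores = [acquisition(model, c, best_f) for c in candidates]
    x_next = candidates[int(np.argmax(scores))]
    xs.append(x_next)
    us.append(h(x_next))

print("best f found:", max(g(u) for u in us))
```

Note the design point the abstract emphasizes: only the surrogate over $h$ needs uncertainty, since $g$ is known and cheap, so posterior samples of the function-valued output can be scored exactly through $g$ rather than modeling $f$ directly.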

