Unexpected Improvements to Expected Improvement for Bayesian Optimization (2310.20708v2)
Abstract: Expected Improvement (EI) is arguably the most popular acquisition function in Bayesian optimization and has found countless successful applications, but its performance is often exceeded by that of more recent methods. Notably, EI and its variants, including those for the parallel and multi-objective settings, are challenging to optimize because their acquisition values vanish numerically in many regions. This difficulty generally increases as the number of observations, the dimensionality of the search space, or the number of constraints grows, resulting in performance that is inconsistent across the literature and most often sub-optimal. Herein, we propose LogEI, a new family of acquisition functions whose members have optima that are identical or approximately equal to those of their canonical counterparts, but which are substantially easier to optimize numerically. We demonstrate that numerical pathologies manifest themselves in "classic" analytic EI and Expected Hypervolume Improvement (EHVI), as well as their constrained, noisy, and parallel variants, and propose corresponding reformulations that remedy these pathologies. Our empirical results show that members of the LogEI family of acquisition functions substantially improve on the optimization performance of their canonical counterparts and, surprisingly, are on par with or exceed the performance of recent state-of-the-art acquisition functions, highlighting the understated role of numerical optimization in the literature.
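The vanishing-value pathology described in the abstract is easy to reproduce: for a candidate whose posterior mean lies far below the incumbent, the analytic EI formula underflows to exactly zero in double precision, so gradient-based acquisition optimizers receive no signal. The sketch below, which is an illustration of the general idea rather than the paper's exact LogEI formulation, contrasts naive EI with a log-domain computation that stays finite by routing the Gaussian CDF-to-PDF ratio through the scaled complementary error function `erfcx`:

```python
import numpy as np
from scipy.special import erfcx
from scipy.stats import norm


def ei(mean, sigma, best):
    """Naive analytic EI for maximization: sigma * (z * Phi(z) + phi(z)).

    Underflows to exactly 0 when the candidate is far below the incumbent.
    """
    z = (mean - best) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))


def log_ei(mean, sigma, best):
    """Numerically stable log-EI sketch (not the paper's exact formulation).

    Uses the identity Phi(z) / phi(z) = sqrt(pi/2) * erfcx(-z / sqrt(2))
    to evaluate log(phi(z) + z * Phi(z)) without underflow.
    """
    z = (mean - best) / sigma
    log_phi = -0.5 * z**2 - 0.5 * np.log(2 * np.pi)
    ratio = z * np.sqrt(np.pi / 2) * erfcx(-z / np.sqrt(2))
    return np.log(sigma) + log_phi + np.log1p(ratio)


# A candidate 40 posterior standard deviations below the incumbent:
print(ei(0.0, 1.0, 40.0))      # 0.0 -- flat region, no gradient signal
print(log_ei(0.0, 1.0, 40.0))  # finite (about -808), so gradients survive
```

Because `log_ei` remains finite and smooth where `ei` is identically zero, a gradient-based optimizer such as L-BFGS-B can still make progress through these regions, which is the core mechanism behind the LogEI family's improved acquisition optimization.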
Authors: Sebastian Ament, Samuel Daulton, David Eriksson, Maximilian Balandat, Eytan Bakshy