Scalable Global Optimization via Local Bayesian Optimization
The paper "Scalable Global Optimization via Local Bayesian Optimization" presents an exploration of a novel approach to optimize high-dimensional black-box functions efficiently through a method known as Trust Region Bayesian Optimization (TuRBO). The challenge that the authors address is the sample-efficient optimization of expensive black-box functions, which becomes particularly pronounced in high-dimensional spaces. Traditional Bayesian Optimization (BO) strategies struggle to maintain competitiveness against other optimization paradigms when dealing with several thousand observations due to the implicit homogeneity and overemphasized exploration in global probabilistic models. This paper introduces a method that leverages local models instead of global ones to handle this issue.
Summary
To tackle the problem of high-dimensional optimization, the authors propose the TuRBO algorithm. TuRBO maintains a collection of local models, fits each one independently, and employs an implicit bandit strategy to allocate samples globally among them. Unlike global models, local models do not suffer from over-exploration and can more accurately capture the heterogeneous nature of complex functions.
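This global allocation step can be pictured with a short sketch. The snippet below is a minimal illustration, not the authors' code: it assumes each local model is a fitted scikit-learn `GaussianProcessRegressor`, each trust region is a (center, side-length) pair in the unit cube, and candidates are drawn uniformly (the paper draws them from a scaled Sobol sequence and reshapes the region using the GP lengthscales). Drawing one posterior sample per candidate and keeping the global top-q realizes the implicit bandit: regions whose models predict better values win more of the batch.

```python
import numpy as np

# Minimal sketch of TuRBO-style global batch selection via Thompson
# sampling. `gps` are fitted sklearn GaussianProcessRegressor instances
# (one per trust region); `regions` are (center, side_length) pairs,
# with centers in [0, 1]^d. All names here are illustrative.
def select_batch(gps, regions, q=4, n_cand=500, seed=0):
    rng = np.random.default_rng(seed)
    cand_x, cand_f = [], []
    for gp, (center, length) in zip(gps, regions):
        lo = np.clip(center - length / 2.0, 0.0, 1.0)  # trust region box,
        hi = np.clip(center + length / 2.0, 0.0, 1.0)  # clipped to domain
        X = rng.uniform(lo, hi, size=(n_cand, center.size))
        # One posterior draw per candidate set = one Thompson sample
        f = gp.sample_y(X, n_samples=1, random_state=seed).ravel()
        cand_x.append(X)
        cand_f.append(f)
    cand_x, cand_f = np.vstack(cand_x), np.concatenate(cand_f)
    best = np.argsort(cand_f)[:q]  # minimization: smallest draws win
    return cand_x[best]            # next batch, drawn across all regions
```

Because each posterior draw is a plausible realization of the objective, a region only receives samples when its own model believes it can beat the others, which is exactly the bandit behavior the paper describes.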
Key Features of TuRBO:
- Local Surrogate Modeling: Instead of a single global surrogate, TuRBO fits several local Gaussian process (GP) models, each associated with its own trust region. This allows the function to be modeled more accurately within a confined neighborhood.
- Trust Regions: Each local model operates within a trust region whose size is adjusted dynamically based on optimization progress: it expands after consecutive successes and contracts after consecutive failures (see the sketch after this list).
- Implicit Bandit Strategy: TuRBO treats the trust regions as arms of a multi-armed bandit when deciding which region to sample from, turning the choice between regions into a principled global decision (illustrated in the snippet above).
- Thompson Sampling: The acquisition step is based on Thompson sampling, which extends naturally to batch selection and offers an intuitive handle on the exploration-exploitation trade-off.
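The trust region maintenance can be sketched as follows. This class is illustrative rather than the reference implementation: the side-length constants mirror those reported in the paper (initial length 0.8, maximum 1.6, restart threshold 2^-7), while the success and failure tolerances shown are illustrative defaults.

```python
import numpy as np

# Illustrative sketch of a single TuRBO trust region. The side length L
# doubles after `succ_tol` consecutive improvements, halves after
# `fail_tol` consecutive failures, and the region is restarted from
# scratch once L falls below L_min.
class TrustRegion:
    def __init__(self, dim, L_init=0.8, L_min=0.5**7, L_max=1.6,
                 succ_tol=3, fail_tol=5):
        self.dim, self.L = dim, L_init
        self.L_min, self.L_max = L_min, L_max
        self.succ_tol, self.fail_tol = succ_tol, fail_tol
        self.n_succ = self.n_fail = 0
        self.best_y = np.inf               # minimization
        self.center = np.random.rand(dim)  # center tracks the incumbent

    def update(self, x_new, y_new):
        """Record one evaluation; return True if the region should restart."""
        if y_new < self.best_y:            # success: new incumbent found
            self.best_y, self.center = y_new, x_new
            self.n_succ, self.n_fail = self.n_succ + 1, 0
        else:                              # failure: no improvement
            self.n_succ, self.n_fail = 0, self.n_fail + 1
        if self.n_succ >= self.succ_tol:   # expand on a success streak
            self.L, self.n_succ = min(2.0 * self.L, self.L_max), 0
        elif self.n_fail >= self.fail_tol: # shrink on a failure streak
            self.L, self.n_fail = max(0.5 * self.L, 0.0), 0
        return self.L < self.L_min
```

Shrinking concentrates samples where the local GP is already accurate, while the restart rule frees a stalled region's budget for the bandit to reassign elsewhere.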
Experimental Evaluation
The algorithm was benchmarked against state-of-the-art optimization methods on several complex real-world tasks spanning robotics, reinforcement learning, and the natural sciences. Key findings from these comparisons include:
- Superior Performance: TuRBO consistently found better solutions faster than other Bayesian optimization algorithms as well as baselines from operations research and evolutionary computation such as BOBYQA and CMA-ES.
- Robustness Across Domains: The algorithm performed strongly across multiple high-dimensional problems, including a robot pushing task, rover trajectory planning, cosmological constant learning, and a lunar lander reinforcement-learning task.
- Efficiency: Despite maintaining multiple local models, TuRBO's computational overhead stays modest, as it leverages scalable Gaussian process regression and a batch evaluation setting.
Implications and Future Directions
The implications of this research are significant for the field of optimization, particularly in applications that involve high-dimensional search spaces and expensive evaluations, such as automated hyperparameter tuning of machine learning models and design optimization in engineering. The local approach adopted by TuRBO could make it a competitive choice for industrial applications where both compute budgets and solution quality are crucial.
Future developments may include extending the TuRBO framework to incorporate derivative information, which could be beneficial for areas such as engineering simulations where such data is accessible. Moreover, integration with techniques for learning local low-dimensional structures could further enhance the accuracy and computational efficiency of the models. Lastly, its potential application in dynamic environments, where the optimization landscape evolves over time, presents an intriguing avenue of research.
Overall, the work presents a significant step towards addressing the scalability issues in Bayesian Optimization for high-dimensional optimization problems. By shifting the focus from global to local probabilistic models, the authors contribute an approach that opens new pathways for scalable and efficient optimization in complex domains.