Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
Abstract: A long-standing belief holds that Bayesian Optimization (BO) with standard Gaussian processes (GPs) -- referred to as standard BO -- underperforms in high-dimensional optimization problems. While this belief seems plausible, it lacks both robust empirical evidence and theoretical justification. To address this gap, we present a systematic investigation. First, through a comprehensive evaluation across twelve benchmarks, we find that while the popular Squared Exponential (SE) kernel often leads to poor performance, using Matérn kernels enables standard BO to consistently achieve top-tier results, frequently surpassing methods specifically designed for high-dimensional optimization. Second, our theoretical analysis reveals that the SE kernel's failure stems primarily from improper initialization of the length-scale parameters: the values commonly used in practice can cause vanishing gradients during training. We provide a probabilistic bound characterizing this issue and show that Matérn kernels are less susceptible, robustly handling much higher dimensions. Third, we propose a simple, robust initialization strategy that dramatically improves the performance of the SE kernel, bringing it close to state-of-the-art methods, without requiring additional priors or regularization. We prove a second probabilistic bound demonstrating that our method effectively mitigates the vanishing-gradient issue. Our findings advocate a re-evaluation of standard BO's potential in high-dimensional settings.
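The mechanism described in the abstract can be illustrated numerically. The sketch below (a minimal illustration, not the paper's actual implementation or its exact initialization rule) compares the SE and Matérn-5/2 kernels at a typical pairwise distance in high dimension, where distances between points in a unit hypercube concentrate around sqrt(d). With a common default length-scale of 1, the SE kernel and its gradient with respect to the length-scale are numerically zero, so gradient-based hyperparameter training stalls; the Matérn-5/2 kernel decays more slowly and keeps a far larger gradient, and a dimension-aware initialization on the order of sqrt(d) restores a healthy SE gradient. The choice `ell = sqrt(d)` here is an illustrative assumption consistent with the abstract, not the paper's stated formula.

```python
import math

def k_se(r, ell):
    # Squared Exponential (RBF) kernel value at distance r
    return math.exp(-r * r / (2.0 * ell * ell))

def k_matern52(r, ell):
    # Matern-5/2 kernel value at distance r
    s = math.sqrt(5.0) * r / ell
    return (1.0 + s + s * s / 3.0) * math.exp(-s)

def grad_wrt_ell(k, r, ell, eps=1e-6):
    # Central finite difference of the kernel w.r.t. the length-scale
    return (k(r, ell + eps) - k(r, ell - eps)) / (2.0 * eps)

d = 100                 # input dimension (illustrative)
r = math.sqrt(d)        # pairwise distances in [0, 1]^d concentrate around sqrt(d)

# Default ell = 1: SE kernel value and gradient are numerically negligible
print("SE,     ell=1      :", k_se(r, 1.0), grad_wrt_ell(k_se, r, 1.0))

# Matern-5/2 at the same ell decays polynomially-damped exponentially,
# retaining a gradient many orders of magnitude larger
print("Matern, ell=1      :", k_matern52(r, 1.0), grad_wrt_ell(k_matern52, r, 1.0))

# Dimension-aware initialization ell ~ sqrt(d) restores a usable SE gradient
print("SE,     ell=sqrt(d):", k_se(r, math.sqrt(d)), grad_wrt_ell(k_se, r, math.sqrt(d)))
```

Running this shows the SE gradient at the default length-scale is smaller than the Matérn-5/2 gradient by roughly thirteen orders of magnitude, while initializing the SE length-scale at sqrt(d) yields a gradient of ordinary size.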