
Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection (2109.09264v2)

Published 20 Sep 2021 in cs.LG and stat.ML

Abstract: Bayesian Optimization (BO) is a method for globally optimizing black-box functions. While BO has been successfully applied to many scenarios, developing BO algorithms that scale effectively to functions with high-dimensional domains remains a challenge. Optimizing such functions with vanilla BO is extremely time-consuming, and alternative strategies that embed the high-dimensional space into a lower-dimensional one are sensitive to the embedding dimension, which must be pre-specified. We develop a new computationally efficient high-dimensional BO method that exploits variable selection. Our method automatically learns axis-aligned subspaces, i.e., spaces spanned by the selected variables, without requiring any pre-specified hyperparameters. We theoretically analyze the computational complexity of our algorithm and derive a regret bound. We empirically demonstrate the efficacy of our method on several synthetic and real problems.
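
To make the core idea concrete, the sketch below illustrates one generic way variable-selection-based BO can work: fit a Gaussian process with automatic relevance determination (ARD), rank variables by their learned lengthscales, and optimize the acquisition function only over the top-ranked, axis-aligned subspace. This is a minimal illustration under assumed helper names (`propose`, `ucb`) and an assumed lengthscale-based selection rule; it is not the paper's exact algorithm or its regret-bound machinery.

```python
# Illustrative sketch of BO with variable selection: fit an ARD Gaussian
# process, keep the variables with the smallest learned lengthscales
# (the most relevant ones), and optimize a UCB acquisition only over
# that axis-aligned subspace. Helper names and the selection rule are
# assumptions for illustration, not the paper's exact procedure.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def ucb(mu, sigma, beta=2.0):
    """Upper confidence bound acquisition value (for maximization)."""
    return mu + beta * sigma


def propose(X, y, n_select=3, n_candidates=2048, seed=0):
    """Suggest the next point to evaluate on the unit hypercube [0, 1]^d."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # ARD RBF kernel: one lengthscale per input dimension.
    gp = GaussianProcessRegressor(
        kernel=RBF(length_scale=np.ones(d)), normalize_y=True
    )
    gp.fit(X, y)
    # A small fitted lengthscale means the objective is sensitive to
    # that variable, so select the n_select most sensitive coordinates.
    lengthscales = np.atleast_1d(gp.kernel_.length_scale)
    selected = np.argsort(lengthscales)[:n_select]
    # Random search over the selected subspace only; the unselected
    # coordinates are pinned to the best point observed so far.
    best = X[np.argmax(y)]
    cand = np.tile(best, (n_candidates, 1))
    cand[:, selected] = rng.uniform(size=(n_candidates, n_select))
    mu, sigma = gp.predict(cand, return_std=True)
    return cand[np.argmax(ucb(mu, sigma))]


# Toy usage: a 25-dimensional objective where only 3 variables matter.
rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 25))
y = -np.sum((X[:, :3] - 0.5) ** 2, axis=1)
x_next = propose(X, y)
```

Each BO iteration would evaluate the black-box objective at `x_next`, append the result to (X, y), and repeat. The appeal of the subspace restriction is that the inner acquisition optimization scales with the number of selected variables rather than with the full input dimension d.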

Authors (2)
  1. Yihang Shen
  2. Carl Kingsford
Citations (3)