Pre-training helps Bayesian optimization too (2207.03084v1)

Published 7 Jul 2022 in cs.LG, cs.AI, and stat.ML

Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs on functions. However, even with expert knowledge, it is not an easy task to select a prior. This is especially true for hyperparameter tuning problems on complex machine learning models, where landscapes of tuning objectives are often difficult to comprehend. We seek an alternative practice for setting these functional priors. In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori. To verify our approach in realistic model training setups, we collected a large multi-task hyperparameter tuning dataset by training tens of thousands of configurations of near-state-of-the-art models on popular image and text datasets, as well as a protein sequence dataset. Our results show that on average, our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
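To make the idea concrete, below is a minimal, hedged sketch (not the paper's exact method): Gaussian process kernel hyperparameters are fit on data from several related tasks and then reused, fixed, as the prior for Bayesian optimization on a new task with an expected-improvement acquisition. The toy objective family, the pooling of hyperparameters by averaging, and all function names are illustrative assumptions, not the authors' implementation.

```python
# Sketch: pre-train GP kernel hyperparameters on related tasks, then run BO
# on a new task with the pre-trained kernel held fixed. This is a simplified
# stand-in for the paper's approach, using scikit-learn GPs.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

rng = np.random.default_rng(0)

def related_task(shift):
    # Toy family of 1-D objectives that share smoothness structure (assumed).
    return lambda x: np.sin(3.0 * (x - shift)) + 0.1 * rng.normal(size=np.shape(x))

# --- Pre-training: fit a GP on each related task and pool the learned kernel
# --- hyperparameters (a crude proxy for maximizing likelihood across tasks).
length_scales, amplitudes = [], []
for shift in [0.0, 0.3, -0.2, 0.5]:
    X = rng.uniform(-2, 2, size=(30, 1))
    y = related_task(shift)(X).ravel()
    gp = GaussianProcessRegressor(C(1.0) * RBF(1.0), normalize_y=True).fit(X, y)
    params = gp.kernel_.get_params()
    amplitudes.append(params["k1__constant_value"])
    length_scales.append(params["k2__length_scale"])

pretrained_kernel = C(np.mean(amplitudes)) * RBF(np.mean(length_scales))

# --- BO on a new task: keep the pre-trained kernel fixed (optimizer=None)
# --- and pick points by expected improvement (minimization).
target = related_task(0.1)
X_obs = rng.uniform(-2, 2, size=(3, 1))
y_obs = target(X_obs).ravel()
candidates = np.linspace(-2, 2, 200).reshape(-1, 1)

for _ in range(10):
    gp = GaussianProcessRegressor(pretrained_kernel, optimizer=None,
                                  normalize_y=True).fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y_obs.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)].reshape(1, -1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, target(x_next).ravel())

print("best value found:", y_obs.min())
```

The sketch conveys the key design choice: instead of hand-picking a GP prior, its hyperparameters are learned from data on similar functions before optimization of the new objective begins.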

Authors (9)
  1. Zi Wang (120 papers)
  2. George E. Dahl (27 papers)
  3. Kevin Swersky (51 papers)
  4. Chansoo Lee (18 papers)
  5. Zelda Mariet (15 papers)
  6. Zachary Nado (23 papers)
  7. Justin Gilmer (39 papers)
  8. Jasper Snoek (42 papers)
  9. Zoubin Ghahramani (108 papers)
Citations (9)