FlexHB: a More Efficient and Flexible Framework for Hyperparameter Optimization (2402.13641v1)

Published 21 Feb 2024 in cs.LG

Abstract: Given a Hyperparameter Optimization (HPO) problem, how can an algorithm be designed to find optimal configurations efficiently? Bayesian Optimization (BO) and multi-fidelity BO methods employ surrogate models to sample configurations based on past evaluations. More recent studies obtain better performance by integrating BO with HyperBand (HB), which accelerates evaluation via an early stopping mechanism. However, these methods ignore the advantage of a suitable evaluation scheme over the default HyperBand, and the capability of BO remains constrained by skewed evaluation results. In this paper, we propose FlexHB, a new method that pushes multi-fidelity BO to the limit and re-designs the early-stopping framework built on Successive Halving (SH). A comprehensive study of FlexHB shows that (1) our fine-grained fidelity method considerably enhances the efficiency of searching for optimal configurations, and (2) our FlexBand framework (self-adaptive allocation of SH brackets, and global ranking of configurations across both current and past SH procedures) grants the algorithm more flexibility and improves anytime performance. Our method achieves superior efficiency and outperforms other methods on various HPO tasks. Empirical results demonstrate that FlexHB achieves up to 6.9x and 11.1x speedups over the state-of-the-art MFES-HB and BOHB, respectively.
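For context on the early-stopping scheme that HyperBand and FlexHB's FlexBand framework build on, below is a minimal Python sketch of one Successive Halving (SH) bracket: many configurations start on a small budget, and at each rung only the top 1/eta survive and are promoted to an eta-times larger budget. The `sample_config` and `evaluate` functions are hypothetical placeholders for illustration, not the paper's implementation, and the sketch omits FlexHB's fine-grained fidelity measurements and global ranking.

```python
# Minimal sketch of one Successive Halving (SH) bracket, the building block
# of HyperBand-style HPO. Placeholder functions only; not FlexHB itself.
import random


def sample_config():
    # Hypothetical search space: learning rate and batch size.
    return {"lr": 10 ** random.uniform(-5, -1),
            "batch_size": random.choice([32, 64, 128])}


def evaluate(config, budget):
    # Placeholder objective: in practice, train `config` for `budget`
    # epochs (or data fraction) and return a validation loss.
    return random.random() / budget


def successive_halving(n_configs=27, min_budget=1, eta=3):
    """Run one SH bracket: start n_configs on a small budget, keep the
    best 1/eta at each rung, and multiply the budget by eta."""
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        results = [(evaluate(c, budget), c) for c in configs]
        results.sort(key=lambda r: r[0])                 # lower loss is better
        configs = [c for _, c in results[: max(1, len(configs) // eta)]]
        budget *= eta                                    # promote survivors
    return configs[0]


best = successive_halving()
print(best)
```

HyperBand runs several such brackets with different trade-offs between the number of configurations and the per-configuration budget; FlexHB's FlexBand instead allocates brackets self-adaptively and ranks configurations globally across SH runs, as described in the abstract.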

References (50)
  1. Abdi, H. 2007. The Kendall rank correlation coefficient. Encyclopedia of measurement and statistics, 2: 508–510.
  2. DEHB: Evolutionary Hyperband for Scalable, Robust and Efficient Hyperparameter Optimization. In Zhou, Z., ed., Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 2147–2153. ijcai.org.
  3. Practical neural network performance prediction for early stopping. arXiv preprint arXiv:1705.10823, 2(3): 6.
  4. Algorithms for hyper-parameter optimization. Advances in neural information processing systems, 24.
  5. Random search for hyper-parameter optimization. Journal of machine learning research, 13(2).
  6. Generalized product of experts for automatic and principled fusion of Gaussian process predictions. arXiv preprint arXiv:1410.7827.
  7. HEBO: Pushing the limits of sample-efficient hyper-parameter optimisation. Journal of Artificial Intelligence Research, 74: 1269–1349.
  8. Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In Twenty-fourth international joint conference on artificial intelligence.
  9. UCI Machine Learning Repository.
  10. Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. In NIPS workshop on Bayesian Optimization in Theory and Practice, volume 10.
  11. Scalable global optimization via local Bayesian optimization. Advances in neural information processing systems, 32.
  12. BOHB: Robust and Efficient Hyperparameter Optimization at Scale. In ICML.
  13. Frazier, P. I. 2018. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811.
  14. Google vizier: A service for black-box optimization. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 1487–1495.
  15. Batch Bayesian optimization via local penalization. In Artificial intelligence and statistics, 648–657. PMLR.
  16. Deep learning. MIT press.
  17. Hansen, N. 2016. The CMA evolution strategy: A tutorial. arXiv preprint arXiv:1604.00772.
  18. AutoML: A survey of the state-of-the-art. Knowledge-Based Systems, 212: 106622.
  19. Multi-fidelity automatic hyper-parameter tuning via transfer series expansion. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 3846–3853.
  20. An asymptotically optimal multi-armed bandit algorithm and hyperparameter optimization. arXiv preprint arXiv:2007.05670.
  21. Sequential model-based optimization for general algorithm configuration. In International conference on learning and intelligent optimization, 507–523. Springer.
  22. Automated machine learning: methods, systems, challenges. Springer Nature.
  23. Population based training of neural networks. arXiv preprint arXiv:1711.09846.
  24. Non-stochastic Best Arm Identification and Hyperparameter Optimization. In Gretton, A.; and Robert, C. C., eds., Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, volume 51 of Proceedings of Machine Learning Research, 240–248. Cadiz, Spain: PMLR.
  25. Multi-fidelity bayesian optimisation with continuous approximations. In International Conference on Machine Learning, 1799–1808. PMLR.
  26. Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly. Journal of Machine Learning Research, 21(81): 1–27.
  27. Almost optimal exploration in multi-armed bandits. In International Conference on Machine Learning, 1238–1246. PMLR.
  28. Fast Bayesian optimization of machine learning hyperparameters on large datasets. In Artificial intelligence and statistics, 528–536. PMLR.
  29. Model-based asynchronous hyperparameter and neural architecture search. arXiv preprint arXiv:2003.10865.
  30. Hyperband: A novel bandit-based approach to hyperparameter optimization. The Journal of Machine Learning Research, 18(1): 6765–6816.
  31. Massively parallel hyperparameter tuning. arXiv preprint arXiv:1810.05934, 5.
  32. Efficient hyperparameter optimization and infinitely many armed bandits. CoRR, abs/1603.06560, 16.
  33. MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 8491–8500.
  34. Tune: A Research Platform for Distributed Model Selection and Training. arXiv preprint arXiv:1807.05118.
  35. SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization. Journal of Machine Learning Research, 23(54): 1–9.
  36. An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2286–2300.
  37. Microsoft. 2021. Neural Network Intelligence.
  38. TPOT: A tree-based pipeline optimization tool for automating machine learning. In Workshop on automatic machine learning, 66–74. PMLR.
  39. Automated reinforcement learning (autorl): A survey and open problems. Journal of Artificial Intelligence Research, 74: 517–568.
  40. Multi-information source optimization. Advances in neural information processing systems, 30.
  41. Noisy Blackbox Optimization using Multi-fidelity Queries: A Tree Search Approach. In Chaudhuri, K.; and Sugiyama, M., eds., Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, 2096–2105. PMLR.
  42. Mastering the game of Go without human knowledge. Nature, 550(7676): 354–359.
  43. Practical Bayesian optimization of machine learning algorithms. Advances in neural information processing systems, 25.
  44. Bayesian optimization with robust Bayesian neural networks. Advances in neural information processing systems, 29.
  45. A simple transfer-learning extension of Hyperband. In NIPS Workshop on Meta-Learning.
  46. BANANAS: Bayesian optimization with neural architectures for neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 10293–10301.
  47. Maximizing acquisition functions for Bayesian optimization. Advances in neural information processing systems, 31.
  48. Few-shot Bayesian optimization with deep kernel surrogates. arXiv preprint arXiv:2101.07667.
  49. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 415: 295–316.
  50. Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(9): 3079 – 3090.
Authors (3)
  1. Yang Zhang (1129 papers)
  2. Haiyang Wu (11 papers)
  3. Yuekui Yang (10 papers)