Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym (2310.00077v3)
Abstract: The growing ubiquity of machine learning (ML) has led it to enter various areas of computer science, including black-box optimization (BBO). Recent research is particularly concerned with Bayesian optimization (BO). BO-based algorithms are popular in the ML community, as they are used for hyperparameter optimization and, more generally, for algorithm configuration. However, their efficiency decreases as the dimensionality of the problem and the budget of evaluations increase. Meanwhile, derivative-free optimization methods have evolved independently in the optimization community. It is therefore timely to ask whether cross-fertilization is possible between the two communities, ML and BBO, i.e., whether algorithms that are heavily used in ML also work well in BBO and vice versa. Comparative experiments often involve rather small benchmarks and exhibit visible problems in the experimental setup, such as poor initialization of baselines, overfitting due to problem-specific tuning of hyperparameters, and low statistical significance. With this paper, we update and extend a comparative study presented by Hutter et al. in 2013. We compare BBO tools for ML with more classical heuristics, first on the well-known BBOB benchmark suite from the COCO environment and then on Direct Policy Search for OpenAI Gym, a reinforcement learning benchmark. Our results confirm that BO-based optimizers perform well on both benchmarks when budgets are limited, albeit at a higher computational cost, while they are often outperformed by algorithms from other families when the evaluation budget becomes larger. We also show that some algorithms from the BBO community perform surprisingly well on ML tasks.
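To make the budget-limited black-box setting concrete, the sketch below implements a minimal (1+1) evolution strategy with the classical 1/5-th success rule, run on a sphere function over the BBOB-style domain [-5, 5]^d. This is an illustrative toy only, not the paper's experimental code (the study relies on Nevergrad and COCO); all function and parameter names here are ours.

```python
import random

def sphere(x):
    """Sphere function: a standard BBOB-style test problem, minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def one_plus_one_es(f, dim, budget, sigma=1.0, seed=0):
    """Minimal (1+1) evolution strategy with a 1/5-th success rule.

    Spends exactly `budget` evaluations of the black-box function `f`
    and returns the best point found together with its value.
    """
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(dim)]  # random start in [-5, 5]^dim
    fx = f(x)
    for _ in range(budget - 1):  # the initial point used one evaluation
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]  # Gaussian mutation
        fy = f(y)
        if fy <= fx:             # success: accept the offspring, enlarge the step size
            x, fx = y, fy
            sigma *= 1.5
        else:                    # failure: shrink the step size
            sigma *= 1.5 ** (-0.25)
    return x, fx

best_x, best_f = one_plus_one_es(sphere, dim=5, budget=100)
print(best_f)
```

Because candidates are accepted only when they do not worsen the objective, the returned value never exceeds that of the random initial point; the step-size adaptation is what lets such simple heuristics remain competitive when the evaluation budget grows.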
- I. Bajaj, A. Arora, and M. M. F. Hasan, “Black-Box Optimization: Methods and Applications,” in Black Box Optimization, Machine Learning, and No-Free Lunch Theorems, ser. Springer Optimization and Its Applications, P. M. Pardalos, V. Rasskazova, and M. N. Vrahatis, Eds. Cham: Springer International Publishing, 2021, pp. 35–65.
- N. Hansen, A. Auger, R. Ros, O. Mersmann, T. Tušar, and D. Brockhoff, “COCO: A platform for comparing continuous optimizers in a black-box setting,” Optimization Methods and Software, vol. 36, no. 1, pp. 114–144, Jan. 2021.
- K. Tang, X. Li, P. N. Suganthan, Z. Yang, and T. Weise, “Benchmark Functions for the CEC’2010 Special Session and Competition on Large-Scale Global Optimization,” University of Science and Technology of China, Tech. Rep., 2010.
- J. Rapin and O. Teytaud, “Nevergrad - A gradient-free optimization platform,” https://GitHub.com/FacebookResearch/Nevergrad, 2018.
- C. Doerr, H. Wang, F. Ye, S. van Rijn, and T. Bäck, “IOHprofiler: A Benchmarking and Profiling Tool for Iterative Optimization Heuristics,” arXiv e-prints:1810.05281, Oct. 2018. [Online]. Available: https://arxiv.org/abs/1810.05281
- J. Rapin, M. Gallagher, P. Kerschke, M. Preuss, and O. Teytaud, “Exploring the MLDA benchmark on the nevergrad platform,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, ser. GECCO ’19. New York, NY, USA: Association for Computing Machinery, Jul. 2019, pp. 1888–1896.
- B. Bischl, P. Kerschke, L. Kotthoff, M. Lindauer, Y. Malitsky, A. Fréchette, H. H. Hoos, F. Hutter, K. Leyton-Brown, K. Tierney, and J. Vanschoren, “ASlib: A benchmark library for algorithm selection,” Artif. Intell., vol. 237, pp. 41–58, 2016. [Online]. Available: https://doi.org/10.1016/j.artint.2016.04.003
- F. Hutter, M. López-Ibáñez, C. Fawcett, M. Lindauer, H. H. Hoos, K. Leyton-Brown, and T. Stützle, “AClib: A benchmark library for algorithm configuration,” in Proc. of Learning and Intelligent Optimization (LION’14), ser. LNCS, vol. 8426. Springer, 2014, pp. 36–40. [Online]. Available: https://doi.org/10.1007/978-3-319-09584-4_4
- Y. Mehta, C. White, A. Zela, A. Krishnakumar, G. Zabergja, S. Moradian, M. Safari, K. Yu, and F. Hutter, “NAS-bench-suite: NAS evaluation is (now) surprisingly easy,” in The Tenth International Conference on Learning Representations, ICLR 2022. OpenReview.net, 2022. [Online]. Available: https://openreview.net/forum?id=0DLwqQLmqV
- A. Zela, J. Siems, and F. Hutter, “NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search,” in Proc. of 8th International Conference on Learning Representations (ICLR’20). OpenReview.net, 2020. [Online]. Available: https://openreview.net/forum?id=SJx9ngStPH
- L. Bliek, A. Guijt, R. Karlsson, S. Verwer, and M. de Weerdt, “Expobench: Benchmarking surrogate-based optimisation algorithms on expensive black-box functions,” CoRR, vol. abs/2106.04618, 2021. [Online]. Available: https://arxiv.org/abs/2106.04618
- T. Bartz-Beielstein, C. Doerr, J. Bossek, S. Chandrasekaran, T. Eftimov, A. Fischbach, P. Kerschke, M. López-Ibáñez, K. M. Malan, J. H. Moore, B. Naujoks, P. Orzechowski, V. Volz, M. Wagner, and T. Weise, “Benchmarking in optimization: Best practice and open issues,” CoRR, vol. abs/2007.03488, 2020. [Online]. Available: https://arxiv.org/abs/2007.03488
- D. R. Jones, M. Schonlau, and W. J. Welch, “Efficient Global Optimization of Expensive Black-Box Functions,” Journal of Global Optimization, vol. 13, no. 4, pp. 455–492, Dec. 1998.
- A. Nayebi, A. Munteanu, and M. Poloczek, “A Framework for Bayesian Optimization in Embedded Subspaces,” in Proceedings of the 36th International Conference on Machine Learning. PMLR, May 2019, pp. 4752–4761.
- Z. Wang, F. Hutter, M. Zoghi, D. Matheson, and N. de Freitas, “Bayesian Optimization in a Billion Dimensions via Random Embeddings,” arXiv:1301.1942 [cs, stat], Jan. 2016.
- Z. Wang, C. Gehring, P. Kohli, and S. Jegelka, “Batched Large-scale Bayesian Optimization in High-dimensional Spaces,” arXiv:1706.01445 [cs, math, stat], May 2018.
- L. Wang, R. Fonseca, and Y. Tian, “Learning search space partition for black-box optimization using monte carlo tree search,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/e2ce14e81dba66dbff9cbc35ecfdb704-Abstract.html
- R. Coulom, “Efficient selectivity and backup operators in Monte-Carlo tree search,” in Proceedings of the 5th International Conference on Computers and Games, ser. CG’06. Berlin, Heidelberg: Springer-Verlag, 2007, pp. 72–83.
- R. Munos, “Optimistic optimization of a deterministic function without the knowledge of its smoothness,” in Proceedings of the 24th International Conference on Neural Information Processing Systems, ser. NIPS’11. Red Hook, NY, USA: Curran Associates Inc., 2011, p. 783–791.
- R. Turner, D. Eriksson, M. McCourt, J. Kiili, E. Laaksonen, Z. Xu, and I. Guyon, “Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020,” in NeurIPS 2020 Competition and Demonstration Track, vol. 133. PMLR, 2020, pp. 3–26. [Online]. Available: http://proceedings.mlr.press/v133/turner21a.html
- C. Cartis, J. Fiala, B. Marteau, and L. Roberts, “Improving the flexibility and robustness of model-based derivative-free optimization solvers,” 2018.
- M. J. Powell, “An efficient method for finding the minimum of a function of several variables without calculating derivatives,” The Computer Journal, vol. 7, no. 2, pp. 155–162, 1964.
- N. Hansen and A. Ostermeier, “Completely derandomized self-adaptation in evolution strategies,” Evolutionary Computation, vol. 9, no. 2, pp. 159–195, 2001.
- A. Auger, M. Schoenauer, and O. Teytaud, “Local and global order 3/2 convergence of a surrogate evolutionary algorithm,” in Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation, ser. GECCO ’05. New York, NY, USA: Association for Computing Machinery, Jun. 2005, pp. 857–864.
- K. Khowaja, M. Shcherbatyy, and W. K. Härdle, “Surrogate models for optimization of dynamical systems,” 2021.
- Z. Luksic, J. Tanevski, S. Dzeroski, and L. Todorovski, “Meta-model framework for surrogate-based parameter estimation in dynamical systems,” IEEE Access, vol. 7, pp. 181829–181841, 2019.
- F. Hutter, H. Hoos, and K. Leyton-Brown, “An evaluation of sequential model-based optimization for expensive blackbox functions,” in Proceedings of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, ser. GECCO ’13 Companion. New York, NY, USA: Association for Computing Machinery, 2013, p. 1209–1216. [Online]. Available: https://doi.org/10.1145/2464576.2501592
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv:1606.01540, 2016. [Online]. Available: https://arxiv.org/abs/1606.01540
- T. L. Lai and H. Robbins, “Asymptotically efficient adaptive allocation rules,” Advances in applied mathematics, vol. 6, no. 1, pp. 4–22, 1985.
- N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design,” IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3250–3265, May 2012.
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, jan 2016.
- D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. P. Lillicrap, K. Simonyan, and D. Hassabis, “Mastering chess and shogi by self-play with a general reinforcement learning algorithm,” CoRR, vol. abs/1712.01815, 2017. [Online]. Available: http://arxiv.org/abs/1712.01815
- R. Munos, “From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning,” Tech. Rep., 2014.
- J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Advances in Neural Information Processing Systems (NIPS), 2012, pp. 2951–2959.
- F. Nogueira, “Bayesian Optimization: Open source constrained global optimization tool for Python,” 2014. [Online]. Available: https://github.com/fmfn/BayesianOptimization
- D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, and M. Poloczek, “Scalable global optimization via local Bayesian optimization,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019. [Online]. Available: https://proceedings.neurips.cc/paper/2019/file/6c990b7aca7bc7058f5e98ea909e924b-Paper.pdf
- Facebook Research, “Ax: Adaptive experimentation platform,” https://ax.dev, 2020.
- F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration.” in LION, ser. Lecture Notes in Computer Science, C. A. C. Coello, Ed., vol. 6683. Springer, 2011, pp. 507–523. [Online]. Available: http://dblp.uni-trier.de/db/conf/lion/lion2011.html#HutterHL11
- J. Bergstra, B. Komer, C. Eliasmith, D. Yamins, and D. D. Cox, “Hyperopt: a python library for model selection and hyperparameter optimization,” Computational Science & Discovery, vol. 8, no. 1, p. 014008, 2015.
- T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019. ACM, 2019, pp. 2623–2631. [Online]. Available: https://doi.org/10.1145/3292500.3330701
- J. Kennedy and R. Eberhart, “Particle Swarm Optimization,” in Proceedings of IEEE International Conference on Neural Networks. IEEE, 1995, pp. 1942–1948. [Online]. Available: https://doi.org/10.1109/ICNN.1995.488968
- L. Meunier, H. Rakotoarison, P. Wong, B. Rozière, J. Rapin, O. Teytaud, A. Moreau, and C. Doerr, “Black-box optimization revisited: Improving algorithm selection wizards through massive benchmarking,” IEEE Trans. Evol. Comput., vol. 26, no. 3, pp. 490–500, 2022. [Online]. Available: https://doi.org/10.1109/TEVC.2021.3108185
- N. Hansen, A. Auger, S. Finck, and R. Ros, “Real-parameter black-box optimization benchmarking 2009: Experimental setup,” INRIA, France, Tech. Rep. RR-6828, 2009.
- M. Thoma, “RL-agents,” https://martin-thoma.com/rl-agents/.
- M. Thoma, “Q-Learning,” https://martin-thoma.com/q-learning/.
- A. Manukyan, M. A. Olivares-Mendez, M. Geist, and H. Voos, “Deep reinforcement learning-based continuous control for multicopter systems,” in 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), 2019, pp. 1876–1881.
- W. C. Lewis-II, M. Moll, and L. E. Kavraki, “How much do unstated problem constraints limit deep robotic reinforcement learning?” CoRR, vol. abs/1909.09282, 2019. [Online]. Available: http://arxiv.org/abs/1909.09282
- R. Henry and D. Ernst, “Gym-ANM: Reinforcement learning environments for active network management tasks in electricity distribution systems,” 2021.
- A. Zubow, S. Rösler, P. Gawłowicz, and F. Dressler, “GrGym: When GNU Radio Goes to (AI) Gym,” in Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications, ser. HotMobile ’21. New York, NY, USA: Association for Computing Machinery, 2021, p. 8–14. [Online]. Available: https://doi.org/10.1145/3446382.3448358
- S. Green, C. M. Vineyard, and Ç. K. Koç, “Impacts of Mathematical Optimizations on Reinforcement Learning Policy Performance,” in 2018 International Joint Conference on Neural Networks (IJCNN), Jul. 2018, pp. 1–8.
- S. Sinha, H. Bharadhwaj, A. Srinivas, and A. Garg, “D2rl: Deep dense architectures in reinforcement learning,” 2020.
- F. Rezazadeh, H. Chergui, L. Alonso, and C. Verikoukis, “Continuous multi-objective zero-touch network slicing via twin delayed ddpg and openai gym,” 2021.
- J. Rapin and O. Teytaud, “Dashboard of results for Nevergrad platform,” https://dl.fbaipublicfiles.com/nevergrad/allxps/list.html, 2020.
- N. Hansen, A. Auger, D. Brockhoff, and T. Tusar, “Anytime performance assessment in blackbox optimization benchmarking,” IEEE Trans. Evol. Comput., vol. 26, no. 6, pp. 1293–1305, 2022. [Online]. Available: https://doi.org/10.1109/TEVC.2022.3210897
- E. Raponi, N. R. Carraz, J. Rapin, C. Doerr, and O. Teytaud, “Comparison of Low-budget Black-box Optimization Algorithms on BBOB,” Zenodo, Sep. 2023. [Online]. Available: https://doi.org/10.5281/zenodo.8375417
- H. Wang, D. Vermetten, F. Ye, C. Doerr, and T. Bäck, “IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics,” ACM Transactions on Evolutionary Learning and Optimization, vol. 2, no. 3, pp. 1–29, 2022. [Online]. Available: https://doi.org/10.1145/3510426
- S. Rahnamayan, H. R. Tizhoosh, and M. M. A. Salama, “Quasi-oppositional differential evolution,” in 2007 IEEE Congress on Evolutionary Computation, Sep. 2007, pp. 2229–2236.
- L. Meunier, C. Doerr, J. Rapin, and O. Teytaud, “Variance reduction for better sampling in continuous domains,” in Parallel Problem Solving from Nature - PPSN XVI - 16th International Conference, PPSN 2020, Leiden, The Netherlands, September 5-9, 2020, Proceedings, Part I, ser. Lecture Notes in Computer Science, vol. 12269. Springer, 2020, pp. 154–168.
- M. Cauwet, C. Couprie, J. Dehos, P. Luc, J. Rapin, M. Rivière, F. Teytaud, O. Teytaud, and N. Usunier, “Fully parallel hyperparameter search: Reshaped space-filling,” in Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, vol. 119. PMLR, 2020, pp. 1338–1348. [Online]. Available: http://proceedings.mlr.press/v119/cauwet20a.html
- M. López-Ibáñez, J. Branke, and L. Paquete, “Reproducibility in evolutionary computation,” ACM Trans. Evol. Learn. Optim., vol. 1, no. 4, pp. 14:1–14:21, 2021. [Online]. Available: https://doi.org/10.1145/3466624
- J. H. Holland, “Adaptation in Natural and Artificial Systems,” University of Michigan Press, 1975.
- V. Khalidov, M. Oquab, J. Rapin, and O. Teytaud, “Consistent population control: generate plenty of points, but with a bit of resampling,” in Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, ser. FOGA ’19. Potsdam, Germany: Association for Computing Machinery, Aug. 2019, pp. 116–123. [Online]. Available: https://doi.org/10.1145/3299904.3340312
- R. Storn and K. Price, “Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces,” Journal of Global Optimization, vol. 11, pp. 341–359, 1997. [Online]. Available: https://doi.org/10.1023/A:1008202821328
- M. A. Schumer and K. Steiglitz, “Adaptive Step Size Random Search,” IEEE Trans. Automat., vol. AC-13, no. 3, pp. 270–276, 1968.
Authors: Elena Raponi, Nathanael Rakotonirina Carraz, Jérémy Rapin, Carola Doerr, Olivier Teytaud