Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym (2310.00077v3)

Published 29 Sep 2023 in cs.LG

Abstract: The growing ubiquity of machine learning (ML) has led it to enter various areas of computer science, including black-box optimization (BBO). Recent research is particularly concerned with Bayesian optimization (BO). BO-based algorithms are popular in the ML community, as they are used for hyperparameter optimization and, more generally, for algorithm configuration. However, their efficiency decreases as the dimensionality of the problem and the budget of evaluations increase. Meanwhile, derivative-free optimization methods have evolved independently in the optimization community. It is therefore important to understand whether cross-fertilization between the two communities, ML and BBO, is possible, i.e., whether algorithms that are heavily used in ML also work well in BBO and vice versa. Existing comparative experiments often involve rather small benchmarks and exhibit visible problems in the experimental setup, such as poor initialization of baselines, overfitting due to problem-specific tuning of hyperparameters, and low statistical significance. With this paper, we update and extend a comparative study presented by Hutter et al. in 2013. We compare BBO tools for ML with more classical heuristics, first on the well-known BBOB benchmark suite from the COCO environment and then on Direct Policy Search for OpenAI Gym, a reinforcement learning benchmark. Our results confirm that BO-based optimizers perform well on both benchmarks when budgets are limited, albeit at a higher computational cost, while they are often outperformed by algorithms from other families when the evaluation budget becomes larger. We also show that some algorithms from the BBO community perform surprisingly well on ML tasks.
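To make the BBOB side of this setup concrete, here is a minimal sketch of a low-budget comparison using Nevergrad's optimizer registry. It is an illustration under stated assumptions, not the paper's experimental code: the sphere function stands in for the COCO/BBOB harness, and the dimension, budget, and optimizer selection are arbitrary choices.

```python
# Minimal sketch of a low-budget optimizer comparison in Nevergrad.
# The sphere function is a stand-in for the COCO/BBOB test suite; the
# dimension, budget, and optimizer list are illustrative only.
import numpy as np
import nevergrad as ng

def sphere(x: np.ndarray) -> float:
    """Separable quadratic test function; global optimum f(0) = 0."""
    return float(np.sum(x ** 2))

dimension, budget = 10, 100  # "low budget": only a few evaluations

for name in ["BO", "CMA", "TwoPointsDE", "PSO", "RandomSearch"]:
    optimizer = ng.optimizers.registry[name](
        parametrization=ng.p.Array(shape=(dimension,)), budget=budget
    )
    recommendation = optimizer.minimize(sphere)
    print(f"{name:>15}: f = {sphere(recommendation.value):.3e}")
```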
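On the reinforcement-learning side, Direct Policy Search treats the total episode reward of a parametrized policy as a black-box objective. The sketch below is a hypothetical minimal example, not the authors' code: it optimizes a linear threshold policy on CartPole with CMA-ES and assumes the classic pre-0.26 gym API.

```python
# Minimal Direct Policy Search sketch: episode return as a black box.
# Assumes the classic gym API (reset() -> obs, step() -> 4-tuple);
# newer gym/gymnasium releases return extra values and need adapting.
import gym
import numpy as np
import nevergrad as ng

env = gym.make("CartPole-v1")

def negative_return(weights: np.ndarray) -> float:
    obs, total, done = env.reset(), 0.0, False
    while not done:
        action = int(np.dot(weights, obs) > 0)  # linear threshold policy
        obs, reward, done, _ = env.step(action)
        total += reward
    return -total  # Nevergrad minimizes, so negate the return

optimizer = ng.optimizers.CMA(
    parametrization=ng.p.Array(shape=env.observation_space.shape),
    budget=200,
)
recommendation = optimizer.minimize(negative_return)
print("best observed return:", -negative_return(recommendation.value))
```

Because the initial state is random, this objective is noisy; averaging the return over several episodes per evaluation would give a more faithful comparison.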

References (64)
  1. I. Bajaj, A. Arora, and M. M. F. Hasan, “Black-Box Optimization: Methods and Applications,” in Black Box Optimization, Machine Learning, and No-Free Lunch Theorems, ser. Springer Optimization and Its Applications, P. M. Pardalos, V. Rasskazova, and M. N. Vrahatis, Eds.   Cham: Springer International Publishing, 2021, pp. 35–65.
  2. N. Hansen, A. Auger, R. Ros, O. Mersmann, T. Tušar, and D. Brockhoff, “COCO: A platform for comparing continuous optimizers in a black-box setting,” Optimization Methods and Software, vol. 36, no. 1, pp. 114–144, Jan. 2021.
  3. K. Tang, X. Li, P. N. Suganthan, Z. Yang, and T. Weise, “Benchmark Functions for the CEC’2010 Special Session and Competition on Large-Scale Global Optimization,” University of Science and Technology of China, Tech. Rep., 2010.
  4. J. Rapin and O. Teytaud, “Nevergrad - A gradient-free optimization platform,” https://GitHub.com/FacebookResearch/Nevergrad, 2018.
  5. C. Doerr, H. Wang, F. Ye, S. van Rijn, and T. Bäck, “IOHprofiler: A Benchmarking and Profiling Tool for Iterative Optimization Heuristics,” arXiv e-prints:1810.05281, Oct. 2018. [Online]. Available: https://arxiv.org/abs/1810.05281
  6. J. Rapin, M. Gallagher, P. Kerschke, M. Preuss, and O. Teytaud, “Exploring the MLDA benchmark on the nevergrad platform,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, ser. GECCO ’19.   New York, NY, USA: Association for Computing Machinery, Jul. 2019, pp. 1888–1896.
  7. B. Bischl, P. Kerschke, L. Kotthoff, M. Lindauer, Y. Malitsky, A. Fréchette, H. H. Hoos, F. Hutter, K. Leyton-Brown, K. Tierney, and J. Vanschoren, “ASlib: A benchmark library for algorithm selection,” Artif. Intell., vol. 237, pp. 41–58, 2016. [Online]. Available: https://doi.org/10.1016/j.artint.2016.04.003
  8. F. Hutter, M. López-Ibáñez, C. Fawcett, M. Lindauer, H. H. Hoos, K. Leyton-Brown, and T. Stützle, “AClib: A benchmark library for algorithm configuration,” in Proc. of Learning and Intelligent Optimization (LION’14), ser. LNCS, vol. 8426.   Springer, 2014, pp. 36–40. [Online]. Available: https://doi.org/10.1007/978-3-319-09584-4_4
  9. Y. Mehta, C. White, A. Zela, A. Krishnakumar, G. Zabergja, S. Moradian, M. Safari, K. Yu, and F. Hutter, “NAS-bench-suite: NAS evaluation is (now) surprisingly easy,” in The Tenth International Conference on Learning Representations, ICLR 2022.   OpenReview.net, 2022. [Online]. Available: https://openreview.net/forum?id=0DLwqQLmqV
  10. A. Zela, J. Siems, and F. Hutter, “NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search,” in Proc. of 8th International Conference on Learning Representations (ICLR’20).   OpenReview.net, 2020. [Online]. Available: https://openreview.net/forum?id=SJx9ngStPH
  11. L. Bliek, A. Guijt, R. Karlsson, S. Verwer, and M. de Weerdt, “Expobench: Benchmarking surrogate-based optimisation algorithms on expensive black-box functions,” CoRR, vol. abs/2106.04618, 2021. [Online]. Available: https://arxiv.org/abs/2106.04618
  12. T. Bartz-Beielstein, C. Doerr, J. Bossek, S. Chandrasekaran, T. Eftimov, A. Fischbach, P. Kerschke, M. López-Ibáñez, K. M. Malan, J. H. Moore, B. Naujoks, P. Orzechowski, V. Volz, M. Wagner, and T. Weise, “Benchmarking in optimization: Best practice and open issues,” CoRR, vol. abs/2007.03488, 2020. [Online]. Available: https://arxiv.org/abs/2007.03488
  13. D. R. Jones, M. Schonlau, and W. J. Welch, “Efficient Global Optimization of Expensive Black-Box Functions,” Journal of Global Optimization, vol. 13, no. 4, pp. 455–492, Dec. 1998.
  14. A. Nayebi, A. Munteanu, and M. Poloczek, “A Framework for Bayesian Optimization in Embedded Subspaces,” in Proceedings of the 36th International Conference on Machine Learning.   PMLR, May 2019, pp. 4752–4761.
  15. Z. Wang, F. Hutter, M. Zoghi, D. Matheson, and N. de Freitas, “Bayesian Optimization in a Billion Dimensions via Random Embeddings,” arXiv:1301.1942 [cs, stat], Jan. 2016.
  16. Z. Wang, C. Gehring, P. Kohli, and S. Jegelka, “Batched Large-scale Bayesian Optimization in High-dimensional Spaces,” arXiv:1706.01445 [cs, math, stat], May 2018.
  17. L. Wang, R. Fonseca, and Y. Tian, “Learning search space partition for black-box optimization using monte carlo tree search,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/e2ce14e81dba66dbff9cbc35ecfdb704-Abstract.html
  18. R. Coulom, “Efficient selectivity and backup operators in Monte-Carlo tree search,” in Proceedings of the 5th International Conference on Computers and Games, ser. CG’06.   Berlin, Heidelberg: Springer-Verlag, 2007, pp. 72–83.
  19. R. Munos, “Optimistic optimization of a deterministic function without the knowledge of its smoothness,” in Proceedings of the 24th International Conference on Neural Information Processing Systems, ser. NIPS’11.   Red Hook, NY, USA: Curran Associates Inc., 2011, pp. 783–791.
  20. R. Turner, D. Eriksson, M. McCourt, J. Kiili, E. Laaksonen, Z. Xu, and I. Guyon, “Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020,” in NeurIPS 2020 Competition and Demonstration Track, vol. 133.   PMLR, 2020, pp. 3–26. [Online]. Available: http://proceedings.mlr.press/v133/turner21a.html
  21. C. Cartis, J. Fiala, B. Marteau, and L. Roberts, “Improving the flexibility and robustness of model-based derivative-free optimization solvers,” 2018.
  22. M. J. Powell, “An efficient method for finding the minimum of a function of several variables without calculating derivatives,” The Computer Journal, vol. 7, no. 2, pp. 155–162, 1964.
  23. N. Hansen and A. Ostermeier, “Completely derandomized self-adaptation in evolution strategies,” Evolutionary Computation, vol. 9, no. 2, pp. 159–195, 2001.
  24. A. Auger, M. Schoenauer, and O. Teytaud, “Local and global order 3/2 convergence of a surrogate evolutionary algorithm,” in Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation, ser. GECCO ’05.   New York, NY, USA: Association for Computing Machinery, Jun. 2005, pp. 857–864.
  25. K. Khowaja, M. Shcherbatyy, and W. K. Härdle, “Surrogate models for optimization of dynamical systems,” 2021.
  26. Z. Luksic, J. Tanevski, S. Dzeroski, and L. Todorovski, “Meta-model framework for surrogate-based parameter estimation in dynamical systems,” IEEE Access, vol. 7, pp. 181829–181841, 2019.
  27. F. Hutter, H. Hoos, and K. Leyton-Brown, “An evaluation of sequential model-based optimization for expensive blackbox functions,” in Proceedings of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, ser. GECCO ’13 Companion.   New York, NY, USA: Association for Computing Machinery, 2013, pp. 1209–1216. [Online]. Available: https://doi.org/10.1145/2464576.2501592
  28. G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv e-prints:1606.01540, 2016. [Online]. Available: https://arxiv.org/pdf/1606.01540
  29. T. L. Lai and H. Robbins, “Asymptotically efficient adaptive allocation rules,” Advances in applied mathematics, vol. 6, no. 1, pp. 4–22, 1985.
  30. N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design,” IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3250–3265, May 2012.
  31. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, jan 2016.
  32. D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. P. Lillicrap, K. Simonyan, and D. Hassabis, “Mastering chess and shogi by self-play with a general reinforcement learning algorithm,” CoRR, vol. abs/1712.01815, 2017. [Online]. Available: http://arxiv.org/abs/1712.01815
  33. R. Munos, “From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning,” Tech. Rep., 2014.
  34. J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Advances in Neural Information Processing Systems (NIPS), 2012, pp. 2951–2959.
  35. F. Nogueira, “Bayesian Optimization: Open source constrained global optimization tool for Python,” 2014. [Online]. Available: https://github.com/fmfn/BayesianOptimization
  36. D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, and M. Poloczek, “Scalable global optimization via local Bayesian optimization,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds., vol. 32.   Curran Associates, Inc., 2019. [Online]. Available: https://proceedings.neurips.cc/paper/2019/file/6c990b7aca7bc7058f5e98ea909e924b-Paper.pdf
  37. FacebookResearch, “Ax - adaptive experimentation,” https://ax.dev, 2020.
  38. F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration.” in LION, ser. Lecture Notes in Computer Science, C. A. C. Coello, Ed., vol. 6683.   Springer, 2011, pp. 507–523. [Online]. Available: http://dblp.uni-trier.de/db/conf/lion/lion2011.html#HutterHL11
  39. J. Bergstra, B. Komer, C. Eliasmith, D. Yamins, and D. D. Cox, “Hyperopt: a python library for model selection and hyperparameter optimization,” Computational Science & Discovery, vol. 8, no. 1, p. 014008, 2015.
  40. T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019.   ACM, 2019, pp. 2623–2631. [Online]. Available: https://doi.org/10.1145/3292500.3330701
  41. J. Kennedy and R. Eberhart, “Particle Swarm Optimization,” in Proceedings of IEEE International Conference on Neural Networks.   IEEE, 1995, pp. 1942–1948. [Online]. Available: https://doi.org/10.1109/ICNN.1995.488968
  42. L. Meunier, H. Rakotoarison, P. Wong, B. Rozière, J. Rapin, O. Teytaud, A. Moreau, and C. Doerr, “Black-box optimization revisited: Improving algorithm selection wizards through massive benchmarking,” IEEE Trans. Evol. Comput., vol. 26, no. 3, pp. 490–500, 2022. [Online]. Available: https://doi.org/10.1109/TEVC.2021.3108185
  43. N. Hansen, A. Auger, S. Finck, and R. Ros, “Real-parameter black-box optimization benchmarking 2009: Experimental setup,” INRIA, France, Tech. Rep. RR-6828, 2009.
  44. M. Thoma, “RL-agents,” https://martin-thoma.com/rl-agents/.
  45. M. Thoma, “Q-Learning,” https://martin-thoma.com/q-learning/.
  46. A. Manukyan, M. A. Olivares-Mendez, M. Geist, and H. Voos, “Deep reinforcement learning-based continuous control for multicopter systems,” in 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), 2019, pp. 1876–1881.
  47. W. C. Lewis-II, M. Moll, and L. E. Kavraki, “How much do unstated problem constraints limit deep robotic reinforcement learning?” CoRR, vol. abs/1909.09282, 2019. [Online]. Available: http://arxiv.org/abs/1909.09282
  48. R. Henry and D. Ernst, “Gym-ANM: Reinforcement learning environments for active network management tasks in electricity distribution systems,” 2021.
  49. A. Zubow, S. Rösler, P. Gawłowicz, and F. Dressler, “GrGym: When GNU Radio Goes to (AI) Gym,” in Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications, ser. HotMobile ’21.   New York, NY, USA: Association for Computing Machinery, 2021, pp. 8–14. [Online]. Available: https://doi.org/10.1145/3446382.3448358
  50. S. Green, C. M. Vineyard, and Ç. K. Koç, “Impacts of Mathematical Optimizations on Reinforcement Learning Policy Performance,” in 2018 International Joint Conference on Neural Networks (IJCNN), Jul. 2018, pp. 1–8.
  51. S. Sinha, H. Bharadhwaj, A. Srinivas, and A. Garg, “D2rl: Deep dense architectures in reinforcement learning,” 2020.
  52. F. Rezazadeh, H. Chergui, L. Alonso, and C. Verikoukis, “Continuous multi-objective zero-touch network slicing via twin delayed ddpg and openai gym,” 2021.
  53. J. Rapin and O. Teytaud, “Dashboard of results for Nevergrad platform,” https://dl.fbaipublicfiles.com/nevergrad/allxps/list.html, 2020.
  54. N. Hansen, A. Auger, D. Brockhoff, and T. Tusar, “Anytime performance assessment in blackbox optimization benchmarking,” IEEE Trans. Evol. Comput., vol. 26, no. 6, pp. 1293–1305, 2022. [Online]. Available: https://doi.org/10.1109/TEVC.2022.3210897
  55. E. Raponi, N. R. Carraz, J. Rapin, C. Doerr, and O. Teytaud, “Comparison of Low-budget Black-box Optimization Algorithms on BBOB,” Zenodo, Sep. 2023. [Online]. Available: https://doi.org/10.5281/zenodo.8375417
  56. H. Wang, D. Vermetten, F. Ye, C. Doerr, and T. Bäck, “IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics,” ACM Transactions on Evolutionary Learning and Optimization, vol. 2, no. 3, pp. 1–29, 2022. [Online]. Available: https://doi.org/10.1145/3510426
  57. S. Rahnamayan, H. R. Tizhoosh, and M. M. A. Salama, “Quasi-oppositional differential evolution,” in 2007 IEEE Congress on Evolutionary Computation, Sep. 2007, pp. 2229–2236.
  58. L. Meunier, C. Doerr, J. Rapin, and O. Teytaud, “Variance reduction for better sampling in continuous domains,” in Parallel Problem Solving from Nature - PPSN XVI - 16th International Conference, PPSN 2020, Leiden, The Netherlands, September 5-9, 2020, Proceedings, Part I, ser. Lecture Notes in Computer Science, vol. 12269.   Springer, 2020, pp. 154–168.
  59. M. Cauwet, C. Couprie, J. Dehos, P. Luc, J. Rapin, M. Rivière, F. Teytaud, O. Teytaud, and N. Usunier, “Fully parallel hyperparameter search: Reshaped space-filling,” in Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, vol. 119.   PMLR, 2020, pp. 1338–1348. [Online]. Available: http://proceedings.mlr.press/v119/cauwet20a.html
  60. M. López-Ibáñez, J. Branke, and L. Paquete, “Reproducibility in evolutionary computation,” ACM Trans. Evol. Learn. Optim., vol. 1, no. 4, pp. 14:1–14:21, 2021. [Online]. Available: https://doi.org/10.1145/3466624
  61. J. H. Holland, “Adaptation in Natural and Artificial Systems,” University of Michigan Press, 1975.
  62. V. Khalidov, M. Oquab, J. Rapin, and O. Teytaud, “Consistent population control: generate plenty of points, but with a bit of resampling,” in Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, ser. FOGA ’19.   New York, NY, USA: Association for Computing Machinery, 2019, pp. 116–123. [Online]. Available: https://doi.org/10.1145/3299904.3340312
  63. R. Storn and K. Price, “Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces,” Journal of Global Optimization, vol. 11, pp. 341–359, 1997. [Online]. Available: https://doi.org/10.1023/A:1008202821328
  64. M. A. Schumer and K. Steiglitz, “Adaptive Step Size Random Search,” IEEE Transactions on Automatic Control, vol. AC-13, no. 3, pp. 270–276, 1968.
Authors (5)
  1. Elena Raponi (14 papers)
  2. Nathanael Rakotonirina Carraz (1 paper)
  3. Jérémy Rapin (20 papers)
  4. Carola Doerr (117 papers)
  5. Olivier Teytaud (45 papers)
Citations (3)
