Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization (2306.09803v3)
Abstract: This paper introduces a modular framework for Mixed-variable and Combinatorial Bayesian Optimization (MCBO) to address the lack of systematic benchmarking and standardized evaluation in the field. Current MCBO papers often introduce non-diverse or non-standard benchmarks to evaluate their methods, impeding the proper assessment of different MCBO primitives and their combinations. Additionally, papers introducing a solution for a single MCBO primitive often omit benchmarking against baselines that utilize the same methods for the remaining primitives. This omission is primarily due to the significant implementation overhead involved, resulting in a lack of controlled assessments and an inability to showcase the merits of a contribution effectively. To overcome these challenges, our proposed framework enables an effortless combination of Bayesian Optimization components, and provides a diverse set of synthetic and real-world benchmarking tasks. Leveraging this flexibility, we implement 47 novel MCBO algorithms and benchmark them against seven existing MCBO solvers and five standard black-box optimization algorithms on ten tasks, conducting over 4000 experiments. Our findings reveal a superior combination of MCBO primitives outperforming existing approaches and illustrate the significance of model fit and the use of a trust region. We make our MCBO library available under the MIT license at https://github.com/huawei-noah/HEBO/tree/master/MCBO.