Discrete Simulation Optimization for Tuning Machine Learning Method Hyperparameters (2201.05978v3)
Abstract: Machine learning (ML) methods are used in many technical domains, such as image recognition, product recommendation, financial analysis, medical diagnosis, and predictive maintenance. An important aspect of implementing an ML method is controlling its learning process so as to maximize the method's performance. Hyperparameter tuning is the process of selecting a suitable set of the parameters that control this learning process. In this work, we demonstrate the use of discrete simulation optimization methods such as ranking and selection (R&S) and random search for identifying a hyperparameter set that maximizes the performance of an ML method. Specifically, we use the KN R&S method and the stochastic ruler random search method, along with one of its variants, for this purpose. We also construct the theoretical basis for applying the KN method, which determines the optimal solution with a statistical guarantee via enumeration of the solution space. In comparison, the stochastic ruler method asymptotically converges to a global optimum and incurs lower computational overhead. We demonstrate the application of these methods to a wide variety of machine learning models, including deep neural network models used for time series prediction and image classification. We benchmark our application of these methods against state-of-the-art hyperparameter optimization libraries such as $hyperopt$ and $mango$. The KN method consistently outperforms $hyperopt$'s random search (RS) and Tree-structured Parzen Estimator (TPE) methods. The stochastic ruler method outperforms the $hyperopt$ RS method and offers statistically comparable performance to $hyperopt$'s TPE method and the $mango$ algorithm.
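To make the stochastic ruler mechanics described above concrete, the sketch below applies its acceptance test to a small discrete hyperparameter grid. It is a minimal illustration under assumptions made here rather than the paper's exact setup: the random-forest grid, the use of the whole grid as each point's neighbourhood, a constant number of ruler tests per candidate, and cross-validated accuracy standing in for the noisy performance measure are all illustrative choices.

```python
# Minimal sketch of the stochastic ruler idea for discrete hyperparameter
# tuning (maximization form). Grid, model, and constant test count are
# illustrative assumptions, not the paper's exact experimental setup.
import random

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def stochastic_ruler(space, evaluate, a, b, n_iter=30, m_tests=2, seed=0):
    """Return the configuration held after n_iter iterations.

    space     : list of candidate hyperparameter configurations (dicts)
    evaluate  : noisy estimate of the performance measure for a configuration
    (a, b)    : bounds assumed to cover the range of the performance measure
    m_tests   : ruler tests per candidate (the original method grows this with
                the iteration count; held constant here for brevity)
    """
    rng = random.Random(seed)
    current = rng.choice(space)
    for _ in range(n_iter):
        candidate = rng.choice(space)          # neighbourhood = whole grid here
        accepted = True
        for _ in range(m_tests):
            theta = rng.uniform(a, b)          # draw the "stochastic ruler"
            if evaluate(candidate) < theta:    # one fresh noisy evaluation per test
                accepted = False               # a single failed test rejects the move
                break
        if accepted:
            current = candidate
    return current


# Hypothetical usage: tune two discrete random forest hyperparameters, with
# cross-validated accuracy playing the role of the noisy simulation output.
X, y = load_digits(return_X_y=True)
space = [{"n_estimators": n, "max_depth": d}
         for n in (50, 100, 200) for d in (4, 8, 16)]

def cv_accuracy(config):
    model = RandomForestClassifier(**config)
    return cross_val_score(model, X, y, cv=3).mean()

best = stochastic_ruler(space, cv_accuracy, a=0.0, b=1.0)
print("selected configuration:", best)
```

A candidate configuration survives only if every independent observation clears an independently drawn ruler, which is the acceptance mechanism behind the asymptotic convergence to a global optimum noted in the abstract; the KN procedure, by contrast, provides its statistical guarantee via enumeration of the solution space at a higher computational cost.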
Authors:
- Varun Ramamohan
- Shobhit Singhal
- Aditya Raj Gupta
- Nomesh Bhojkumar Bolia