HyperArm Bandit Optimization: A Novel Approach to Hyperparameter Optimization and an Analysis of Bandit Algorithms in Stochastic and Adversarial Settings

Published 13 Mar 2025 in cs.LG | (2503.10282v1)

Abstract: This paper studies bandit algorithms in stochastic and adversarial settings, covering both theoretical analysis and practical application. It first introduces the bandit problem, distinguishes the stochastic and adversarial variants, and examines three key algorithms: Explore-Then-Commit (ETC), Upper Confidence Bound (UCB), and the Exponential-Weight Algorithm for Exploration and Exploitation (EXP3). Their theoretical regret bounds are analyzed and compared. The paper then introduces a novel framework, HyperArm Bandit Optimization (HABO), which applies EXP3 to hyperparameter tuning of machine learning models. Unlike traditional methods that treat each complete configuration as an arm, HABO treats each individual hyperparameter as a super-arm and its potential configurations as sub-arms, enabling dynamic resource allocation and efficient exploration. Experimental results on classification and regression tasks show HABO outperforming Bayesian Optimization in both computational efficiency and accuracy. The paper concludes with insights into HABO's convergence guarantees and its potential for scalable, robust hyperparameter optimization.
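
For readers unfamiliar with EXP3, the following is a minimal sketch of the textbook algorithm for rewards in [0, 1]. It is not the paper's HABO implementation. The class name `Exp3`, the exploration rate `gamma`, the helper `run_demo`, and the fixed reward values are all assumptions chosen for illustration. The three arms can be read as three candidate settings of a single hypothetical hyperparameter, with the reward standing in for a validation score.

```python
import math
import random


class Exp3:
    """Textbook EXP3 for rewards in [0, 1] (illustrative, not the paper's code)."""

    def __init__(self, n_arms, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma                  # exploration rate (assumed value)
        self.log_weights = [0.0] * n_arms   # weights kept in log space to avoid overflow
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized exponential weights with uniform exploration.
        m = max(self.log_weights)
        w = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n_arms for wi in w]

    def select(self):
        # Sample one arm according to the current distribution.
        return self.rng.choices(range(self.n_arms), weights=self.probabilities())[0]

    def update(self, arm, reward):
        # Importance-weighted estimate: only the pulled arm receives a nonzero update.
        p = self.probabilities()[arm]
        self.log_weights[arm] += self.gamma * (reward / p) / self.n_arms


def run_demo(rounds=5000):
    # Hypothetical fixed rewards for three candidate settings; arm 1 is the best.
    rewards = [0.2, 0.8, 0.5]
    bandit = Exp3(n_arms=len(rewards))
    for _ in range(rounds):
        arm = bandit.select()
        bandit.update(arm, rewards[arm])
    return bandit.probabilities()


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the sampling distribution should concentrate on the arm with the highest reward, while every arm keeps a probability of at least `gamma / n_arms`. HABO, as the abstract describes it, applies EXP3 with hyperparameters as super-arms and their configurations as sub-arms; the sketch above covers only the single-level algorithm.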