Overview of the Max K-Armed Bandit Problem for Automated Machine Learning
The paper addresses the CASH (Combined Algorithm Selection and Hyperparameter optimization) problem in AutoML, a significant challenge because it requires balancing exploration across different model classes and their hyperparameters while remaining computationally efficient. The authors propose MaxUCB, an algorithm designed to address the shortcomings of existing max k-armed bandit algorithms in settings where the reward distributions have distinct characteristics, namely being light-tailed and bounded.
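For concreteness, the CASH search space can be pictured as a hierarchical space: first pick a model class (a bandit "arm"), then a hyperparameter configuration within it. A minimal sketch, where the model classes, hyperparameter names, and ranges are all illustrative rather than taken from the paper:

```python
import random

# Hypothetical CASH search space: each arm is a model class with its own
# hyperparameter ranges (names and ranges are illustrative only).
SEARCH_SPACE = {
    "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 32)},
    "svm": {"C": (1e-3, 1e3), "gamma": (1e-4, 1e1)},
    "mlp": {"hidden_units": (8, 256), "learning_rate": (1e-5, 1e-1)},
}

def sample_configuration(arm, rng):
    """Draw one configuration from the arm's ranges (naive uniform sampling;
    a real system would use log scales and integer types where appropriate)."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in SEARCH_SPACE[arm].items()}

rng = random.Random(0)
config = sample_configuration("svm", rng)  # one candidate for the "svm" arm
```

Selecting *which* arm to sample next, given a limited evaluation budget, is exactly the bandit decision that MaxUCB makes.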
Key Contributions and Findings
MaxUCB targets the specific characteristics of the CASH problem, where model performances are not only heterogeneous across model classes but also follow bounded, negatively skewed reward distributions. These characteristics deviate from the heavy-tailed, parametric assumptions common in prior max k-armed (extreme) bandit algorithms. The authors introduce an exploration-exploitation strategy tailored to these adjusted statistical assumptions, which they derive from empirical analyses of AutoML reward landscapes. Notably, the objective is to maximize the single best observed reward rather than the average reward, since in CASH only the best configuration found matters; this objective does not align naturally with traditional bandit approaches that seek high average rewards.
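The summary does not reproduce the paper's exact arm-selection index, but the general shape of a UCB-style max k-armed bandit loop can be sketched as follows: each arm tracks its best observed reward, and the arm with the highest "best reward plus exploration bonus" is pulled next. The index formula and the role of alpha here are illustrative assumptions, not the paper's definition:

```python
import math
import random

def max_ucb_sketch(reward_fns, budget, alpha=1.0, seed=0):
    """Illustrative UCB-style max k-armed bandit loop (not the paper's exact
    index). Each arm tracks its best observed reward; the selection index
    adds an exploration bonus that shrinks as the arm is pulled more often."""
    rng = random.Random(seed)
    k = len(reward_fns)
    pulls = [0] * k
    best = [-math.inf] * k

    for t in range(1, budget + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            # Max-reward analogue of a UCB index: best value so far + bonus.
            arm = max(range(k),
                      key=lambda j: best[j]
                      + math.sqrt(alpha * math.log(t) / pulls[j]))
        reward = reward_fns[arm](rng)  # e.g. validation score of one HPO trial
        pulls[arm] += 1
        best[arm] = max(best[arm], reward)

    return max(best), pulls

# Two illustrative arms with bounded rewards in [0, 1]; Beta(8, 1) is
# negatively skewed, mimicking the reward shape the paper describes.
arms = [lambda rng: rng.betavariate(8, 1), lambda rng: rng.betavariate(1, 8)]
best_reward, pulls = max_ucb_sketch(arms, budget=200)
```

In this sketch the budget is concentrated on the arm whose maximum keeps improving, which mirrors the dynamic resource allocation described above.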
Empirical evaluations across four AutoML benchmarks support MaxUCB's advantage over prior methods. Its ability to dynamically allocate the computational budget across model classes leads to consistently better performance than existing bandit strategies and combined search methods.
Theoretical and Empirical Implications
From a theoretical standpoint, MaxUCB exploits the bounded nature of the reward distributions to obtain high-probability regret bounds suited to this setting. Under the stated assumptions, the analysis shows improved sample efficiency: exploratory overhead is reduced and the number of evaluations spent on suboptimal arms grows sublinearly. These findings resolve open questions about the applicability of extreme bandits to hyperparameter optimization, which previously hinged on parametric tail assumptions.
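The summary does not reproduce the paper's exact bound, but the notion of regret it refers to is the standard one for extreme bandits: performance is measured by how far the best reward observed under the algorithm's arm choices falls short of the best reward the single best arm would have produced. Schematically (the notation here is illustrative):

```latex
% Extreme-bandit (max k-armed) regret after T rounds: I_t is the arm pulled
% at round t, X_{k,t} the reward of arm k at round t, and k^* the best arm.
\mathrm{Regret}(T)
  = \mathbb{E}\Bigl[\max_{1 \le t \le T} X_{k^*,t}\Bigr]
  - \mathbb{E}\Bigl[\max_{1 \le t \le T} X_{I_t,t}\Bigr]
```

A high-probability bound on this quantity, combined with boundedness of the rewards, is what allows the number of pulls of suboptimal arms to grow sublinearly in T.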
Empirically, MaxUCB identifies strong model classes and hyperparameters faster than combined search, indicating its practical utility in real-world AutoML systems. Its performance remains robust to variations in dataset complexity and available computational resources, demonstrating its adaptability.
Future Directions and Considerations
The paper suggests several promising avenues for further refinement. Adaptive tuning of the exploration parameter α based on problem-specific data characteristics could remove the need for manual calibration. Additionally, incorporating mechanisms to handle non-stationary reward distributions could address scenarios where data or model performance evolves over time.
The methodology outlined in this paper potentially extends beyond CASH in AutoML, offering insights into hierarchical search problems and similar settings where heterogeneous model evaluations are central. Future investigation into multi-fidelity optimization, leveraging trends in data streams, and cost-aware approaches could yield even more efficient solutions for complex ML pipelines.
In summary, MaxUCB is a step forward in tackling the complexities of automated machine learning optimization, notably by exploiting the distinctive shape of the reward distributions encountered in CASH. The findings represent meaningful advances in both theoretical understanding and practical application within the domain.