
Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Published 24 Jun 2020 in cs.LG and stat.ML | (2006.13570v3)

Abstract: Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, ResNet 20 and Wide ResNet 28-10 architectures, we improve upon both deep and batch ensembles.

Citations (194)

Summary

  • The paper introduces hyperparameter ensembles that enhance model accuracy and calibration by combining hyperparameter diversity with multiple random initializations.
  • It leverages a random search across hyperparameters to create hyper-deep and hyper-batch ensembles, yielding superior performance on image classification tasks.
  • The approach reduces computational burden while improving uncertainty quantification, offering practical benefits for applications like autonomous systems and medical diagnostics.

Ensembles of neural networks trained from different random initializations, known as deep ensembles, achieve state-of-the-art accuracy and calibration. Batch ensembles offer a parameter-efficient drop-in alternative, but their diversity is constrained by their shared-weight architecture. This paper introduces a complementary axis of diversity: ensembling over hyperparameters as well as weights, with the aim of improving performance across both budget regimes.
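As a quick illustration of the mechanism all of these methods share, an ensemble predicts by averaging its members' predictive distributions. The following is a minimal NumPy sketch; the function name and array shapes are illustrative, not the paper's code:

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average the predictive distributions of ensemble members.

    member_probs: array of shape (n_members, n_examples, n_classes),
    each slice a softmax output from one independently trained network.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    return member_probs.mean(axis=0)

# Two hypothetical members that disagree on an example: the averaged
# prediction spreads its probability mass, reflecting the disagreement.
p1 = np.array([[0.9, 0.1]])
p2 = np.array([[0.2, 0.8]])
print(ensemble_predict([p1, p2]))  # [[0.55 0.45]]
```

This spreading of probability mass under member disagreement is what improves calibration and uncertainty estimates relative to a single network.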

Overview of Proposed Methods

The research develops two main variants of hyperparameter ensembles: hyper-deep ensembles and hyper-batch ensembles. Hyper-deep ensembles combine hyperparameter diversity with diverse initializations: a random search over hyperparameters is stratified across multiple random initializations, and the resulting pool of models yields substantial performance gains over both deep and batch ensembles. The parameter-efficient counterpart, hyper-batch ensembles, builds on the layer structure of batch ensembles and self-tuning networks to support both weight and hyperparameter diversity at significantly lower computational and memory cost.
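The hyper-deep ensemble procedure can be sketched as follows. This is an illustrative simplification: `train_fn` and `sample_hparams` are hypothetical helpers, and selecting members by individual validation NLL stands in for the paper's greedy ensemble-level selection.

```python
import random

def hyper_deep_ensemble(train_fn, sample_hparams, n_search=20,
                        n_seeds=3, ensemble_size=4, seed=0):
    """Sketch of hyper-deep ensemble construction (assumed helper API:
    train_fn(hparams, seed) -> (model, val_nll);
    sample_hparams(rng) -> dict of hyperparameters)."""
    rng = random.Random(seed)

    # 1) Random search: one trained model per sampled hyperparameter config.
    searched = [(hp, train_fn(hp, rng.randrange(10**9)))
                for hp in (sample_hparams(rng) for _ in range(n_search))]
    searched.sort(key=lambda x: x[1][1])  # sort by validation NLL

    # 2) Stratify: retrain the best configs from multiple fresh
    #    initializations, mixing hyperparameter and weight diversity.
    pool = []
    for hp, (model, nll) in searched[:ensemble_size]:
        pool.append((model, nll))
        for _ in range(n_seeds - 1):
            pool.append(train_fn(hp, rng.randrange(10**9)))

    # 3) Keep the members with the best validation NLL (a stand-in for
    #    the paper's greedy ensemble selection).
    pool.sort(key=lambda x: x[1])
    return [model for model, _ in pool[:ensemble_size]]

# Tiny demonstration with stub helpers (no real training involved):
def _train(hp, seed):
    return (("net", hp["lr"], seed), abs(hp["lr"] - 0.1))  # fake val NLL

def _sample(rng):
    return {"lr": rng.uniform(1e-3, 1.0)}

members = hyper_deep_ensemble(_train, _sample)
print(len(members))  # 4
```

The key design point is step 2: rather than ensembling over seeds alone (a deep ensemble) or over hyperparameters alone (a random-search top-k), the pool deliberately mixes both sources of diversity before selection.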

Experimental Findings

The methods are validated on image classification tasks (e.g., CIFAR-10, CIFAR-100) across several architectures: MLP, LeNet, ResNet 20, and Wide ResNet 28-10. Hyper-deep ensembles consistently outperform standard deep ensembles, with higher classification accuracy and better calibration as measured by negative log-likelihood (NLL) and expected calibration error (ECE). Critically, hyper-batch ensembles not only outperform batch ensembles but do so at markedly lower computational and memory cost.
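For reference, the two calibration metrics used in these comparisons can be computed as follows. This is a standard textbook formulation sketched in NumPy, not the paper's evaluation code; the bin count is an assumption.

```python
import numpy as np

def negative_log_likelihood(probs, labels):
    """Mean NLL of the predicted probability of the true class (lower is better)."""
    eps = 1e-12  # guard against log(0)
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

def expected_calibration_error(probs, labels, n_bins=15):
    """ECE: average gap between accuracy and confidence over equal-width
    confidence bins, weighted by bin occupancy (lower is better)."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

# Small worked example with hypothetical predictions:
probs = np.array([[0.8, 0.2], [0.6, 0.4]])
labels = np.array([0, 1])
print(round(negative_log_likelihood(probs, labels), 3))  # 0.57
```

NLL rewards assigning high probability to the correct class; ECE measures whether a model's stated confidence matches its empirical accuracy. An ensemble can improve both simultaneously, which is the axis on which hyper-deep and hyper-batch ensembles are compared against their baselines.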

Implications and Future Directions

The introduction of hyperparameter diversity through ensembling marks a notable advance in constructing neural ensembles for robust and reliable predictions. The research underscores the value of combining weight-space and hyperparameter diversity. Practically, the findings promise improvements in fields that rely on predictive uncertainty, such as autonomous systems and medical diagnostics. Theoretically, the work deepens our understanding of how ensemble diversity shapes model performance and uncertainty estimates.

Future work may explore more compact hyperparameter-tuning methods, especially in scenarios constrained by memory or processing power. There is also significant potential in extending the approach to structural diversity, such as variable network depth or width, yielding ensembles that span a broader spectrum of hyperparameters.

In conclusion, hyperparameter ensembles are a potent approach to building neural network ensembles, delivering robustness and efficiency without sacrificing performance, and a meaningful step toward more dependable machine learning models.
