Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (1612.01474v3)

Published 5 Dec 2016 in stat.ML and cs.LG

Abstract: Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state-of-the-art for estimating predictive uncertainty; however these require significant modifications to the training procedure and are computationally expensive compared to standard (non-Bayesian) NNs. We propose an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are as good or better than approximate Bayesian NNs. To assess robustness to dataset shift, we evaluate the predictive uncertainty on test examples from known and unknown distributions, and show that our method is able to express higher uncertainty on out-of-distribution examples. We demonstrate the scalability of our method by evaluating predictive uncertainty estimates on ImageNet.

Authors (3)
  1. Balaji Lakshminarayanan (62 papers)
  2. Alexander Pritzel (23 papers)
  3. Charles Blundell (54 papers)
Citations (5,254)

Summary

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

The paper "Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles" by Lakshminarayanan, Pritzel, and Blundell proposes an ensemble-based approach to estimate predictive uncertainty in deep neural networks (NNs). Their method addresses significant challenges in NN-based predictions, particularly the issue of overconfident incorrect predictions. The proposed solution leverages the strengths of ensembles while being simpler and more scalable than existing Bayesian methods.

Overview

Deep NNs have achieved impressive performance across various domains but struggle with predictive uncertainty estimation. Bayesian NNs, which learn a distribution over weights, have been the primary approach to uncertainty, but they require significant modifications to the training procedure and are often computationally prohibitive. This paper presents a non-Bayesian alternative, deep ensembles, that is computationally efficient, easy to implement, and effective at producing high-quality uncertainty estimates.

Methodology

The proposed method integrates three core components:

  1. Proper Scoring Rules: Networks are trained with proper scoring rules such as the negative log-likelihood (NLL) or the Brier score; for regression, each network outputs a predictive mean and variance and minimizes the Gaussian NLL, which rewards calibrated predictive distributions rather than point accuracy alone.
  2. Adversarial Training: Using the fast gradient sign method, adversarial examples are generated on the fly and added to the training loss, smoothing the predictive distribution in a neighborhood of each training point.
  3. Ensembles: M networks (five in most of the paper's experiments) are trained independently from random initializations, and their predictions are averaged to form the predictive distribution. This diversity mitigates overfitting to specific data points and improves uncertainty estimates; a training-step sketch follows below.
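Below is a minimal PyTorch-style sketch of one ensemble member's training step, combining the Gaussian NLL scoring rule with FGSM-based adversarial training as described above. It assumes a regression setting; the `GaussianMLP` class, the hidden width, and the `eps` value are illustrative placeholders rather than the paper's exact configuration (the paper sets the perturbation to a small fraction of the input range).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianMLP(nn.Module):
    """Illustrative regression network with mean and variance heads."""
    def __init__(self, d_in, d_hidden=50):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.mean_head = nn.Linear(d_hidden, 1)
        self.var_head = nn.Linear(d_hidden, 1)

    def forward(self, x):
        h = self.body(x)
        # Softplus keeps the predicted variance positive; the small floor
        # avoids numerical issues if the variance collapses toward zero.
        return self.mean_head(h), F.softplus(self.var_head(h)) + 1e-6

def gaussian_nll(mean, var, y):
    """Negative log-likelihood of y under N(mean, var), up to a constant.

    Minimizing a proper scoring rule like the NLL rewards calibrated
    predictive distributions, not just accurate point estimates.
    """
    return (0.5 * torch.log(var) + 0.5 * (y - mean) ** 2 / var).mean()

def train_step(net, opt, x, y, eps=0.01):
    """One optimization step with FGSM-style adversarial training."""
    x = x.detach().requires_grad_(True)
    loss = gaussian_nll(*net(x), y)
    # Fast gradient sign method: perturb inputs in the direction that
    # most increases the loss, then also train on the perturbed batch.
    grad, = torch.autograd.grad(loss, x, retain_graph=True)
    adv_loss = gaussian_nll(*net((x + eps * grad.sign()).detach()), y)
    opt.zero_grad()
    (loss + adv_loss).backward()
    opt.step()
```

Each member is trained this way from its own random initialization with no coordination between members, which is why the procedure parallelizes trivially across machines.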

Experimental Results

The authors validate their method through extensive experiments on both classification and regression benchmarks. Key findings include:

  • On toy regression datasets, learning a predictive variance via a proper scoring rule and combining the members as a mixture of Gaussians produced better uncertainty estimates than the empirical variance of an ensemble trained with MSE (see the combination sketch after this list).
  • On standard regression benchmarks, the method often outperformed or matched state-of-the-art Bayesian techniques, particularly in NLL.
  • For classification tasks (e.g., MNIST, SVHN, ImageNet), deep ensembles significantly outperformed MC-dropout in terms of accuracy, NLL, and Brier score.
  • The method demonstrated robustness to dataset shift, producing higher uncertainty on out-of-distribution examples (e.g., NotMNIST inputs for MNIST-trained models and CIFAR-10 inputs for SVHN-trained models).
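For regression, the combination rule referenced in the first bullet treats the ensemble as a uniformly weighted mixture of Gaussians and collapses it to a single Gaussian by matching moments. A short NumPy sketch, assuming each of the M members returns arrays of per-example means and variances:

```python
import numpy as np

def combine_gaussians(means, variances):
    """Combine M Gaussian predictions into one mean and variance.

    means, variances: arrays of shape (M, N) for M ensemble members
    and N test inputs, treated as a uniformly weighted mixture.
    """
    mixture_mean = means.mean(axis=0)
    # Law of total variance for a uniform mixture:
    # Var = E[var_m + mean_m^2] - (E[mean_m])^2
    mixture_var = (variances + means ** 2).mean(axis=0) - mixture_mean ** 2
    return mixture_mean, mixture_var
```

For classification the analogous rule is simply to average the members' softmax probabilities; the entropy of that averaged distribution is the uncertainty signal behind the out-of-distribution results above.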

Implications and Future Directions

The practical implications of this work are substantial. By providing a simple, scalable method for estimating predictive uncertainty, it facilitates the deployment of NNs in real-world applications where reliable uncertainty estimates are critical, such as medical diagnosis and autonomous driving. The method's compatibility with large-scale data and distributed computation makes it attractive for practical use cases.

Theoretically, this work challenges the prevailing dominance of Bayesian approaches in uncertainty estimation by showcasing a powerful non-Bayesian alternative. It opens avenues for further exploration of ensemble-based methods and hybrid approaches that blend Bayesian and non-Bayesian techniques.

Future research directions include:

  • Enhancing ensemble diversity through techniques beyond random initialization.
  • Investigating the combination of ensemble methods with advanced training techniques like distillation or stochastic weight averaging.
  • Extending the methodology to semi-supervised learning settings and exploring other forms of adversarial training for improved generalization.

In conclusion, this paper presents a compelling case for deep ensembles as a practical, efficient, and effective method for predictive uncertainty estimation. Its strong empirical performance and simplicity are likely to inspire further research and application in the field of neural networks and AI.
