Deep Deterministic Uncertainty: A Conceptually Simple Approach to Uncertainty Quantification
Uncertainty quantification in machine learning, particularly in neural networks, is crucial for improving model reliability in applications that demand safety and trust, such as autonomous driving and medical diagnostics. The paper "Deep Deterministic Uncertainty: A Simple Baseline" investigates how to estimate uncertainty from a deterministic, single-forward-pass model, offering a simpler and computationally cheaper alternative to deep ensembles without compromising performance.
Two Types of Uncertainty
Model uncertainty can broadly be divided into two types: epistemic and aleatoric. Epistemic uncertainty arises from a lack of training data and is reducible with additional data. Aleatoric uncertainty, by contrast, stems from inherent noise or ambiguity in the data and remains irreducible no matter how much data is collected. Distinguishing the two matters in practice: active learning should focus on inputs with high epistemic uncertainty (potential information gain) while avoiding inherently ambiguous points, whereas out-of-distribution (OoD) detection requires reliably flagging inputs that do not conform to the training distribution. A toy decomposition makes the distinction concrete, as shown below.
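To illustrate the two notions, here is a minimal sketch using the standard ensemble-style decomposition of predictive entropy into an aleatoric part (the expected per-model entropy) and an epistemic part (the disagreement between models). The probability values are made up purely for illustration, and DDU itself does not require an ensemble.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of a categorical distribution."""
    return -np.sum(p * np.log(p + eps), axis=axis)

# Hypothetical softmax outputs from an ensemble of 3 models for a single input.
# Case A: models agree on an ambiguous prediction -> mostly aleatoric uncertainty.
ambiguous = np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
# Case B: models disagree confidently -> mostly epistemic uncertainty.
disagreeing = np.array([[0.95, 0.05], [0.05, 0.95], [0.5, 0.5]])

for name, probs in [("ambiguous", ambiguous), ("disagreeing", disagreeing)]:
    total = entropy(probs.mean(axis=0))   # entropy of the averaged prediction
    aleatoric = entropy(probs).mean()     # mean per-model entropy
    epistemic = total - aleatoric         # mutual information (model disagreement)
    print(f"{name}: total={total:.3f} aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```

The ambiguous case yields high total entropy that is entirely aleatoric, while the disagreeing case yields the same total entropy with a substantial epistemic component, which is exactly the distinction the paper cares about.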
Prior Works and Their Complexities
Traditional approaches to uncertainty quantification, such as Bayesian neural networks and deep ensembles, are computationally expensive because they rely on multiple forward passes or on training multiple models. Deep ensembles in particular stand out for the quality of their uncertainty estimates, but their computational overhead limits their applicability in real-time or resource-constrained settings.
Recent single-forward-pass approaches such as DUQ and SNGP introduce more complex architectures and training regimes. They replace the standard softmax output with radial basis function (RBF) or Gaussian process (GP) layers and add feature-space regularization such as spectral normalization or Jacobian (gradient) penalties. Although the resulting uncertainty estimates are competitive, these methods require modifying the base architecture and often fail to separate aleatoric from epistemic uncertainty effectively.
The Simplicity of Deep Deterministic Uncertainty (DDU)
The authors propose a more straightforward alternative termed Deep Deterministic Uncertainty (DDU). The method applies spectral normalization as feature-space regularization to models with residual connections, such as ResNets. After training, aleatoric uncertainty is obtained from the softmax entropy, which can be further calibrated (e.g., via temperature scaling). Epistemic uncertainty is quantified by fitting Gaussian Discriminant Analysis (GDA), one Gaussian per class, to the feature-space representations of the training data and using the resulting feature density. Unlike previous approaches, this requires no input pre-processing, feature-space ensembling, or retraining with OoD data.
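This post-hoc step lends itself to a short sketch. Assuming the trained network's features and logits for the training set are already computed, the following illustrates how one Gaussian per class could be fitted and how the two uncertainty scores would then be read off; the function names and the jitter value are illustrative and not the paper's reference implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import softmax, logsumexp

def fit_gda(features, labels, n_classes, jitter=1e-4):
    """Fit one Gaussian per class to training features (GDA / class-conditional GMM)."""
    components, priors = [], []
    for c in range(n_classes):
        fc = features[labels == c]
        mean = fc.mean(axis=0)
        # Regularize the empirical covariance so it stays invertible.
        cov = np.cov(fc, rowvar=False) + jitter * np.eye(fc.shape[1])
        components.append(multivariate_normal(mean=mean, cov=cov))
        priors.append(len(fc) / len(features))
    return components, np.array(priors)

def epistemic_score(components, priors, feature):
    """Epistemic proxy: negative log marginal feature density q(z) = sum_c pi_c N(z; mu_c, Sigma_c)."""
    log_joint = np.array([np.log(p) + g.logpdf(feature) for g, p in zip(components, priors)])
    return -logsumexp(log_joint)

def aleatoric_score(logits):
    """Aleatoric proxy: entropy of the softmax predictive distribution."""
    p = softmax(logits)
    return -np.sum(p * np.log(p + 1e-12))
```

High epistemic_score (low feature density) flags inputs far from the training distribution, while aleatoric_score remains meaningful for ambiguous in-distribution inputs.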
Empirical Results
The effectiveness of DDU is demonstrated across multiple tasks. In active learning on datasets such as Dirty-MNIST, DDU reliably isolates epistemic uncertainty and performs on par with deep ensembles, even outperforming them when aleatoric uncertainty is high. On OoD detection benchmarks such as CIFAR-10 vs. SVHN and ImageNet vs. ImageNet-O, DDU outperforms existing deterministic single-pass methods and matches ensembles, supporting the claim that feature-space density is a reliable epistemic signal once the feature space is adequately regularized.
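For OoD detection, the feature-space density simply plays the role of a score in the standard AUROC evaluation. The sketch below uses placeholder density values (not real benchmark numbers) to show how a density-based detector would be scored.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical log densities log q(z) from the fitted GDA for in-distribution
# (e.g., CIFAR-10 test) and out-of-distribution (e.g., SVHN) inputs.
# Higher density should mean "more in-distribution"; these are placeholder values.
rng = np.random.default_rng(0)
log_density_id = rng.normal(loc=0.0, scale=1.0, size=1000)
log_density_ood = rng.normal(loc=-3.0, scale=1.5, size=1000)

scores = np.concatenate([log_density_id, log_density_ood])
labels = np.concatenate([np.zeros(1000), np.ones(1000)])  # 1 = OoD

# Negate the density so that a higher score means "more likely OoD",
# matching the positive label for roc_auc_score.
auroc = roc_auc_score(labels, -scores)
print(f"AUROC (feature-density OoD detector): {auroc:.3f}")
```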
Technical Insights
A key insight behind DDU is the clean separation of the two uncertainties: aleatoric uncertainty is read from the predictive (softmax) probabilities, while epistemic uncertainty comes from the feature-space density. This separation avoids a well-known pitfall of softmax entropy alone, which conflates the two. Spectral normalization, combined with residual connections, keeps the feature extractor both sensitive (distinct inputs map to distinct features) and smooth (small input changes produce small feature changes), counteracting the feature collapse seen in unregularized models, where OoD inputs can be mapped onto in-distribution feature regions.
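As a rough illustration of this regularization, the following sketch wraps the convolutions of a residual block with PyTorch's built-in spectral_norm utility. Note that the paper's implementation uses a soft variant of spectral normalization with a tunable Lipschitz coefficient, so this is a simplified stand-in, and the block itself is a generic example rather than the paper's architecture.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SNResidualBlock(nn.Module):
    """Residual block whose convolutions are spectrally normalized.

    Bounding the Lipschitz constant of the residual branch, together with the
    identity skip connection, encourages an approximately bi-Lipschitz mapping:
    distinct inputs stay distinct in feature space (sensitivity) while features
    change smoothly with the input (smoothness).
    """
    def __init__(self, channels):
        super().__init__()
        self.conv1 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.conv1(x))
        h = self.conv2(h)
        return self.act(x + h)  # identity path helps preserve input distances

block = SNResidualBlock(channels=16)
out = block(torch.randn(2, 16, 32, 32))
print(out.shape)  # torch.Size([2, 16, 32, 32])
```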
Conclusion and Future Directions
DDU points toward more accessible and computationally efficient uncertainty quantification for neural networks. By keeping the architecture standard and confining the extra work to post-training steps, it is well suited to dynamic and resource-constrained environments. Future work might extend these ideas beyond residual networks and explore richer feature-space density estimators that do not require complex training procedures. As AI models continue to evolve, robust uncertainty estimation will remain pivotal for trust and reliability in critical applications.