- The paper introduces the Posterior Network (PostNet), a framework that estimates uncertainty in classification tasks without needing out-of-distribution data during training.
- PostNet uses normalizing flows and density-based pseudo-counts to model aleatoric and epistemic uncertainties via a Dirichlet distribution.
- PostNet demonstrates state-of-the-art performance in OOD detection, improves calibration using a Bayesian loss, and is computationally efficient compared to sampling methods.
Posterior Network: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts
Introduction
Demand for reliable AI systems that can quantify their predictive uncertainty has grown significantly across domains such as robotics, finance, and healthcare. Traditional uncertainty estimation techniques, such as dropout and ensembles, are effective but incur high computational costs at inference time because they rely on sampling. Alternatively, some modern approaches directly infer the parameters of a distribution over predictions. However, these typically require explicit out-of-distribution (OOD) data during training, which is rarely available in practice.
Contributions
This paper introduces the Posterior Network (PostNet), an innovative framework designed to estimate uncertainty in classification tasks without requiring OOD samples during training. Leveraging normalizing flows, PostNet effectively learns a closed-form posterior distribution over class probability predictions. The model showcases state-of-the-art performance in capturing uncertainties for both in-distribution and OOD data, and in maintaining well-calibrated predictions under dataset shifts.
Methodology
PostNet models both aleatoric and epistemic uncertainties via a Dirichlet distribution:
- Aleatoric uncertainty pertains to inherent data randomness (e.g., noise in measurements).
- Epistemic uncertainty reflects the model's lack of knowledge, e.g., for inputs unlike anything seen during training.
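The two uncertainties above can both be read off a Dirichlet's concentration parameters. The sketch below illustrates this decomposition; the epistemic proxy K/α₀ (inverse total evidence) is one common choice for illustration, not necessarily the paper's exact metric.

```python
import numpy as np
from scipy.stats import entropy

def dirichlet_uncertainties(alpha):
    """Split a Dirichlet's uncertainty into aleatoric and epistemic parts.

    alpha: concentration parameters (pseudo-counts), shape (K,).
    """
    alpha = np.asarray(alpha, dtype=float)
    alpha0 = alpha.sum()              # total evidence observed for this input
    p_mean = alpha / alpha0           # expected class probabilities
    aleatoric = entropy(p_mean)       # entropy of the mean prediction (data noise)
    epistemic = len(alpha) / alpha0   # little evidence -> high epistemic uncertainty
    return aleatoric, epistemic

# Confident in-distribution input: lots of evidence for class 0.
print(dirichlet_uncertainties([100.0, 1.0, 1.0]))
# OOD-like input: almost no evidence, alpha near the uniform prior.
print(dirichlet_uncertainties([1.0, 1.0, 1.0]))
```

Note how the flat Dirichlet yields high values on both axes, while concentrated evidence drives both down.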
Architecture:
- Latent Space Encoding: A neural network encoder maps inputs to a latent space.
- Density Estimation: Employing normalizing flows, PostNet defines class-conditional density functions in the latent space.
- Pseudo-Counts: The class-conditional densities, scaled by the number of training samples per class, act as pseudo-observations that parameterize the Dirichlet distribution.
- Bayesian Loss: A novel Bayesian loss aligns the Dirichlet parameters with the observed labels, with an entropy regularization term that discourages overly peaked, overconfident distributions.
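The pseudo-count step above can be sketched in a few lines. Here fixed 2-D Gaussians stand in for the learned class-conditional normalizing flows, and the class means, counts, and flat prior are illustrative assumptions, not values from the paper:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Stand-ins for the flow-based latent densities q(z|c) that PostNet would learn.
class_densities = [
    multivariate_normal(mean=[0.0, 0.0]),
    multivariate_normal(mean=[4.0, 4.0]),
]
class_counts = np.array([500, 500])  # N_c: training examples per class
beta_prior = 1.0                     # flat Dirichlet prior Dir(1, ..., 1)

def dirichlet_params(z):
    """alpha_c = beta_prior + N_c * q(z|c): density-scaled pseudo-counts."""
    densities = np.array([d.pdf(z) for d in class_densities])
    return beta_prior + class_counts * densities

# Near class 0's density: a large alpha_0, i.e., confident evidence.
print(dirichlet_params(np.array([0.0, 0.0])))
# Far from all training data: densities vanish, alpha falls back to the prior.
print(dirichlet_params(np.array([20.0, -20.0])))
```

Falling back to the prior for distant latent points is exactly what lets PostNet flag OOD inputs without ever training on them.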
This formulation ties epistemic uncertainty to the training data density: regions well covered by training samples of a class receive confident predictions, while sparse or unseen regions receive high uncertainty.
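For a Dirichlet, the expected cross-entropy in the Bayesian loss has a closed form via digamma functions. A hedged sketch of such an uncertain cross-entropy loss with entropy regularization (the weight λ and the toy parameters are illustrative, not the paper's settings):

```python
import numpy as np
from scipy.special import digamma
from scipy.stats import dirichlet

def uce_loss(alpha, y, lam=1e-3):
    """E_{p ~ Dir(alpha)}[CE(p, y)] - lam * H(Dir(alpha)).

    The expectation has the closed form digamma(alpha_0) - digamma(alpha_y);
    the entropy term rewards smoother, less overconfident Dirichlets.
    """
    alpha = np.asarray(alpha, dtype=float)
    expected_ce = digamma(alpha.sum()) - digamma(alpha[y])
    return expected_ce - lam * dirichlet(alpha).entropy()

# Evidence concentrated on the true class y=0 gives a low loss...
print(uce_loss([50.0, 1.0, 1.0], y=0))
# ...while the same evidence on a wrong class is heavily penalized.
print(uce_loss([1.0, 50.0, 1.0], y=0))
```

Because the expectation is available in closed form, the loss needs no Monte Carlo sampling, which is part of what keeps training and inference cheap.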
Results
PostNet demonstrates robustness across multiple benchmark datasets (Segment, Sensorless Drive, MNIST, and CIFAR-10):
- Calibration: The Bayesian loss function notably enhances calibration over Dirichlet-based counterparts.
- OOD Detection: Without any predefined OOD samples at training time, PostNet notably surpasses competing methods at recognizing OOD and domain-shifted inputs.
- Efficiency: By learning explicit epistemic distributions, PostNet mitigates the computational burden typically seen with sampling-based methods like dropout and ensembles.
Implications and Future Work
PostNet enables robust uncertainty quantification, which could significantly impact risk-sensitive fields seeking dependable AI systems, especially those involving decision-making under uncertainty. Its capacity to learn without OOD data is particularly appealing for applications where such data is scarce or unknown. Future research may extend PostNet to more complex models or apply its foundational concepts to unsupervised anomaly detection tasks. There is also scope to refine the normalizing flow components for better scalability and efficiency.
Conclusion
The Posterior Network framework offers a pragmatic solution to estimating predictive uncertainty in machine learning, removing dependencies on OOD data and achieving impressive results in uncertainty calibration. This positions PostNet as a potentially valuable tool in advancing the development of reliable and trustworthy AI systems.