- The paper introduces a training method that combines additional loss terms with GAN-generated samples to reduce overconfidence on OOD instances.
- The approach substantially improves OOD detection, achieving near-perfect AUROC and AUPR with CIFAR-10 and SVHN as in-distribution datasets.
- The method enhances AI safety by maintaining high in-distribution accuracy while reliably distinguishing between familiar and novel inputs.
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples
Abstract and Motivation
Deep neural networks (DNNs) have demonstrated exceptional performance across classification tasks such as speech recognition, image classification, video prediction, and medical diagnosis. Despite their high accuracy, DNNs are known to produce overconfident predictions even on out-of-distribution (OOD) samples, which poses a significant challenge for AI safety and for applications that require reliable uncertainty estimates. Previous work has primarily focused on threshold-based OOD detectors built on top of pre-trained classifiers, but the performance of such detectors depends heavily on how well the underlying classifier separates in-distribution from out-of-distribution samples. This paper introduces a novel training method that improves a classifier's OOD detection performance while maintaining its classification accuracy.
Methodology
The proposed method augments the standard cross-entropy loss with two additional components during training. The first is a loss term that minimizes the Kullback-Leibler (KL) divergence between the classifier's predictive distribution on OOD samples and the uniform distribution, thereby reducing the classifier's confidence on those samples. The second is a generative adversarial network (GAN) that supplies the OOD training samples on which this term is applied. The classifier and the GAN are trained jointly, which enables the classifier to produce more reliable uncertainty estimates; a sketch of one joint iteration follows.
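As a concrete illustration, here is a high-level PyTorch sketch of one joint training iteration. This is our reconstruction, not the authors' code: `C`, `G`, `D`, the optimizers, and `latent_dim` are hypothetical placeholders, and `confidence_loss` and `generator_loss` refer to the sketches given in the sections below.

```python
import torch
import torch.nn.functional as F

def joint_step(C, G, D, x_in, y_in, opt_c, opt_d, opt_g, beta=1.0, latent_dim=100):
    """One alternating update of classifier C, discriminator D, and generator G."""
    z = torch.randn(x_in.size(0), latent_dim, device=x_in.device)
    x_fake = G(z)  # candidate samples near the in-distribution boundary

    # (a) Classifier: cross-entropy on real data plus a KL-to-uniform penalty
    #     on generated samples (the confidence loss defined in the next section).
    loss_c = confidence_loss(C(x_in), y_in, C(x_fake.detach()), beta)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # (b) Discriminator: standard GAN update, real vs. generated.
    real, fake = D(x_in), D(x_fake.detach())
    loss_d = (F.binary_cross_entropy_with_logits(real, torch.ones_like(real)) +
              F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # (c) Generator: fool D while keeping C near-uniform on generated samples.
    x_fake = G(z)  # regenerate so gradients flow into G
    loss_g = generator_loss(D(x_fake), C(x_fake), beta)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```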
Loss Function Design
The new loss function, termed confidence loss, is defined as follows:
min_θ E_{P_in(𝒙, y)}[ −log P_θ(y | 𝒙) ] + β E_{P_out(𝒙)}[ KL( 𝒰(y) ∥ P_θ(y | 𝒙) ) ]
where θ denotes the model parameters, P_in the in-distribution, P_out the out-distribution, β a penalty (trade-off) parameter, and 𝒰(y) the uniform distribution over class labels. The KL divergence term forces the classifier to assign low confidence to OOD samples, enhancing the separability between in- and out-distributions.
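Below is a minimal PyTorch sketch of this confidence loss, assuming a K-class classifier that outputs logits; the function name, signature, and β default are illustrative, not from the paper.

```python
import math

import torch.nn.functional as F

def confidence_loss(logits_in, labels_in, logits_out, beta=1.0):
    """Cross-entropy on in-distribution data plus beta * KL(U || P_theta)
    on OOD data, pushing OOD predictions toward the uniform distribution."""
    ce = F.cross_entropy(logits_in, labels_in)
    log_probs_out = F.log_softmax(logits_out, dim=1)
    num_classes = logits_out.size(1)
    # Per sample: KL(U || P_theta) = -log K - (1/K) * sum_y log P_theta(y|x),
    # which is exactly zero when the prediction is uniform.
    kl = -log_probs_out.mean(dim=1) - math.log(num_classes)
    return ce + beta * kl.mean()
```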
Generative Adversarial Network for OOD Sample Generation
To address the problem of generating effective OOD samples, the authors propose a new GAN architecture. Unlike traditional GANs, which generate samples resembling the in-distribution, the proposed GAN generates samples in the low-density regions near the boundary of the in-distribution. This approach is motivated by the insight that OOD samples close to the in-distribution in the feature space are more effective for training the classifier to distinguish between in- and out-distributions. The GAN training objective includes a term that minimizes the KL divergence between the generated samples' predictive distribution and the uniform distribution, along with the original GAN objective.
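The following sketch shows how the generator's objective could combine a GAN term with the KL-to-uniform penalty on the classifier's predictions. It is our reading under stated assumptions: the paper formulates the original minimax GAN objective, whereas this sketch substitutes the common non-saturating generator loss, and all tensor and function names are hypothetical.

```python
import math

import torch
import torch.nn.functional as F

def generator_loss(d_fake_logits, clf_fake_logits, beta=1.0):
    """Fool the discriminator while keeping the classifier's predictive
    distribution on generated samples close to uniform, which steers
    generation toward low-density regions near the in-distribution boundary."""
    # Non-saturating GAN term: the generator wants D(G(z)) to look real.
    gan_term = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    # KL(U || P_theta(y | G(z))); the constant -log K does not affect the
    # gradients, but keeping it makes the minimum exactly zero.
    log_probs = F.log_softmax(clf_fake_logits, dim=1)
    kl_term = (-log_probs.mean(dim=1) - math.log(clf_fake_logits.size(1))).mean()
    return gan_term + beta * kl_term
```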
Experimental Results
The proposed method is evaluated using deep convolutional neural networks, such as AlexNet and VGGNet, on image classification tasks with CIFAR-10 and SVHN as in-distribution datasets and datasets such as ImageNet and LSUN as out-of-distribution test sets. The results demonstrate that classifiers trained using the proposed method significantly improve the detection performance of threshold-based detectors across all experiments. In particular, the VGGNet trained with this method achieves almost perfect detection performance on both CIFAR-10 and SVHN.
Numerical Results
The experiments highlight the effectiveness of the proposed training method in OOD detection. For instance, a VGGNet trained with the proposed method on CIFAR-10 achieves:
- Detection accuracy of 99.9%,
- AUROC of 100.0%,
- AUPR (in) of 100.0%,
- AUPR (out) of 99.9%.
Similarly, on the SVHN dataset, the classifier achieves comparable results, demonstrating robustness across different in-distributions.
Implications and Future Work
The proposed training method significantly enhances neural networks' ability to detect OOD samples without compromising their classification accuracy. This advancement has practical implications for deploying AI systems in safety-critical applications, such as medical diagnostics and autonomous systems, where reliable uncertainty estimates are crucial.
Future work could explore integrating this training method with Bayesian probabilistic models and ensemble techniques to further improve OOD detection performance. Additionally, extending the approach to other tasks, such as regression and network calibration, could provide further insights into its generalizability and utility in various machine learning domains.
Conclusion
This paper presents a novel training method that integrates a confidence loss framework with a GAN-based OOD sample generator, effectively addressing the overconfidence issue in DNNs for OOD detection. The experimental results underscore the method's efficacy, offering a robust solution for enhancing AI safety and reliability in real-world applications.