- The paper proposes RMD, a modification of the traditional Mahalanobis Distance that normalizes distances using a class-independent Gaussian reference to improve near-OOD detection.
- An eigen-analysis reveals that non-discriminative feature dimensions dominate the standard MD score; the proposed normalization suppresses their influence, yielding up to a 15% AUROC improvement on genomics benchmarks.
- RMD offers a hyperparameter-free, easily integrated solution that enhances OOD detection in both from-scratch and pre-trained models for sensitive real-world applications.
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
The paper under review introduces a modification to the Mahalanobis Distance (MD) to enhance out-of-distribution (OOD) detection, specifically targeting cases where the OOD inputs are semantically similar to the in-distribution data, referred to as "near-OOD" scenarios. The proposed technique, termed Relative Mahalanobis Distance (RMD), addresses a limitation of the traditional MD approach by subtracting the Mahalanobis distance computed under a class-independent background Gaussian from each per-class distance, effectively improving detection robustness and performance.
Background and Motivation
OOD detection is crucial for deploying machine learning models in real-world applications, especially those demanding high reliability and safety. While numerous advanced methodologies have been explored, including generative models and modified training objectives, these approaches often require retraining models or access to OOD examples, which imposes practical constraints. MD-based methods, known for their simplicity and efficacy in detecting far-OOD samples, falter in near-OOD cases due to their sensitivity to feature dimensions that do not significantly differ between in-distribution (IND) and OOD data.
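To make the baseline concrete, the following sketch shows the standard MD-based OOD score described above: the squared Mahalanobis distance to the nearest class-conditional Gaussian with a shared covariance. The function name and interface are illustrative, not from the paper; the key point is that every feature dimension contributes to the score, including dimensions that barely differ between IND and near-OOD inputs.

```python
import numpy as np

def mahalanobis_score(x, class_means, shared_cov):
    """Standard MD-based OOD score: squared distance to the nearest
    class-conditional Gaussian (shared covariance across classes).
    Every feature dimension contributes, so low-variance dimensions
    that do not separate IND from near-OOD data can dominate."""
    inv = np.linalg.inv(shared_cov)
    dists = [(x - mu) @ inv @ (x - mu) for mu in class_means]
    return min(dists)  # higher means more OOD
```

With an identity covariance this reduces to the squared Euclidean distance to the nearest class mean, which makes the dimension-weighting issue easy to see.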
Methodological Advances
The paper meticulously examines the failure modes of MD for near-OOD detection. It conducts an eigen-analysis to elucidate how MD disproportionately weights feature dimensions that do not provide meaningful discrimination between IND and OOD samples. RMD mitigates this issue by subtracting from each per-class Mahalanobis distance the distance under a background Gaussian fit to the entire training set irrespective of class labels. This normalization cancels contributions from non-discriminative features, focusing the score on dimensions that matter for OOD detection.
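The scheme above can be sketched as follows. This is a minimal illustration under simplifying assumptions (shared per-class covariance estimated by pooling, plain matrix inversion); the function names are hypothetical, but the core computation matches the paper's description: RMD for class k is the per-class Mahalanobis distance minus the distance under the class-independent background Gaussian, and the OOD score is the minimum over classes.

```python
import numpy as np

def fit_gaussians(feats, labels):
    """Fit per-class Gaussians with a shared covariance (standard MD setup)
    plus a class-independent background Gaussian fit to all features."""
    classes = np.unique(labels)
    means = [feats[labels == c].mean(axis=0) for c in classes]
    # Shared covariance: pool the class-centered residuals
    centered = np.vstack([feats[labels == c] - mu
                          for c, mu in zip(classes, means)])
    cov = centered.T @ centered / len(feats)
    # Background Gaussian ignores class labels entirely
    mu0 = feats.mean(axis=0)
    cov0 = np.cov(feats, rowvar=False, bias=True)
    return means, cov, mu0, cov0

def rmd_score(x, means, cov, mu0, cov0):
    """Relative Mahalanobis Distance OOD score:
    min_k [ MD_k(x) - MD_0(x) ], higher means more OOD.
    Subtracting MD_0 cancels non-discriminative feature dimensions."""
    cov_inv = np.linalg.inv(cov)
    cov0_inv = np.linalg.inv(cov0)
    md0 = (x - mu0) @ cov0_inv @ (x - mu0)
    return min((x - mu) @ cov_inv @ (x - mu) - md0 for mu in means)
```

Because MD_0 is large for any input far from the bulk of the data in non-discriminative directions, the subtraction leaves a score driven mainly by the class-separating directions, which is what near-OOD detection needs.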
Experimental Insights
A comprehensive suite of experiments validates the efficacy of the proposed RMD method. On challenging benchmarks across vision, language, and genomics domains, RMD consistently outperforms traditional MD and maximum softmax probability (MSP), with notable improvements such as up to a 15% increase in AUROC on genomics OOD tasks.
- Without Pre-training: When models are trained from scratch, RMD provides significant gains over MD and MSP, underscoring its utility even when the learned features are of limited quality.
- With Pre-training: Leveraging models pre-trained on large datasets, such as Vision Transformer (ViT), BiT, and CLIP, RMD demonstrates robust improvements in OOD detection performance. These results indicate that high-quality features obtained from pre-training are beneficial for RMD.
Implications and Future Directions
The findings present RMD as a straightforward, hyperparameter-free enhancement of MD that significantly boosts near-OOD detection capabilities without requiring model retraining. By suppressing the influence of non-discriminative features, RMD shows promise in further elevating the reliability of deployments in sensitive applications.
From a theoretical perspective, the RMD methodology opens up new avenues in the exploration of feature-based normalization approaches across various OOD detection tasks. Practically, its hyperparameter-free nature and ease of integration make RMD a compelling choice for real-world applications, particularly where model retraining is impractical.
Conclusion
The introduction of Relative Mahalanobis Distance marks a substantial step forward in addressing the shortcomings of standard MD in near-OOD scenarios. This paper provides a clear demonstration of RMD's advantages, supported by robust empirical results across multiple domains. Future developments may explore extending RMD's framework beyond Gaussian assumptions or integrating it with other generative models to enhance OOD detection capabilities further.