
A Geometric Explanation of the Likelihood OOD Detection Paradox

Published 27 Mar 2024 in cs.LG, cs.AI, cs.CV, and stat.ML (arXiv:2403.18910v2)

Abstract: Likelihood-based deep generative models (DGMs) commonly exhibit a puzzling behaviour: when trained on a relatively complex dataset, they assign higher likelihood values to out-of-distribution (OOD) data from simpler sources. Adding to the mystery, OOD samples are never generated by these DGMs despite having higher likelihoods. This two-pronged paradox has yet to be conclusively explained, making likelihood-based OOD detection unreliable. Our primary observation is that high-likelihood regions will not be generated if they contain minimal probability mass. We demonstrate how this seeming contradiction of large densities yet low probability mass can occur around data confined to low-dimensional manifolds. We also show that this scenario can be identified through local intrinsic dimension (LID) estimation, and propose a method for OOD detection which pairs the likelihoods and LID estimates obtained from a pre-trained DGM. Our method can be applied to normalizing flows and score-based diffusion models, and obtains results which match or surpass state-of-the-art OOD detection benchmarks using the same DGM backbones. Our code is available at https://github.com/layer6ai-labs/dgm_ood_detection.


Summary

  • The paper demonstrates that DGMs assign unexpectedly high likelihoods to simpler OOD data due to low probability mass on low-dimensional manifolds.
  • It introduces local intrinsic dimension (LID) estimation as a method to quantify density and guide a dual threshold for effective OOD detection.
  • Experimental results show a significant AUC-ROC improvement from 0.070 to 0.953 in the challenging FMNIST vs. MNIST scenario, demonstrating the method's effectiveness.

A Geometric Explanation of the Likelihood OOD Detection Paradox

Introduction

The paper explores the perplexing behavior exhibited by likelihood-based deep generative models (DGMs) in the context of out-of-distribution (OOD) detection. Specifically, these models, when trained on complex datasets, tend to assign higher likelihoods to OOD data from simpler datasets. This paradox arises despite the fact that DGMs do not generate samples from these high-likelihood regions. The paper proposes a geometric explanation, suggesting that high likelihoods can coincide with low probability mass due to the data's confinement to low-dimensional manifolds. The authors introduce local intrinsic dimension (LID) estimation as a means to detect this scenario and propose a method for OOD detection, leveraging the combination of likelihoods and LID estimates.

Methodology

Likelihood Behavior in DGMs

The authors begin by highlighting the unusual trend whereby trained DGMs assign higher likelihoods to simpler OOD datasets than to the more complex in-distribution datasets they were trained on. This observation is coupled with the fact that DGMs generate samples that visually resemble the training data, never producing the seemingly high-likelihood OOD samples. The paper posits that this behavior can occur when OOD data resides in regions of low probability mass (Figure 1).

Figure 1: FMNIST-trained DM vs. MNIST.
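The density-versus-mass distinction can be made concrete with a toy example. The one-dimensional mixture below is constructed for this summary, not taken from the paper: a very narrow "spike" component at x = 5 has roughly ten times the density of the broad mode at x = 0, yet only about 1% of samples ever land near it.

```python
import numpy as np
from scipy.stats import norm

# Mixture: 99% broad component, 1% extremely narrow "spike".
w_broad, w_spike = 0.99, 0.01
broad = norm(loc=0.0, scale=1.0)
spike = norm(loc=5.0, scale=1e-3)

def density(x):
    return w_broad * broad.pdf(x) + w_spike * spike.pdf(x)

# Density at the spike's centre dwarfs density at the broad mode...
print(density(5.0))  # ≈ 3.99
print(density(0.0))  # ≈ 0.395

# ...yet the probability mass near the spike is tiny: samples from
# the mixture almost never fall there.
rng = np.random.default_rng(0)
n = 100_000
comp = rng.random(n) < w_spike
samples = np.where(
    comp,
    spike.rvs(size=n, random_state=rng),
    broad.rvs(size=n, random_state=rng),
)
frac_near_spike = np.mean(np.abs(samples - 5.0) < 0.1)
print(frac_near_spike)  # ≈ 0.01
```

This mirrors the paradox: the highest-density point is assigned a large likelihood, but a sampler visits its neighbourhood only rarely because that neighbourhood carries almost no mass.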

Relationship Between LID and Probability Mass

The authors explore the relationship between local intrinsic dimension and contiguous volume, establishing that LID serves as a proxy for how densely a model distributes probability mass around a given point. The volume assigned around a point in low-dimensional space is smaller than in higher-dimensional regions, allowing high densities without substantial probability mass. The paper illustrates this concept using Gaussian convolutions and other mathematical formulations, showing empirically that the intrinsic dimension can be effectively captured by the rank of specific matrices.
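The rank-based intuition can be illustrated with a simple, model-free sketch. The paper's LID estimates come from the pre-trained DGM itself; the local-PCA estimator below is only a standalone stand-in, included to show how intrinsic dimension falls out of the spectrum of a local neighbourhood matrix (the function name and thresholds are choices made for this illustration).

```python
import numpy as np

def lid_local_pca(x, data, k=50, var_thresh=0.99):
    """Crude LID proxy: number of principal components needed to
    explain var_thresh of the variance among the k nearest neighbours."""
    dists = np.linalg.norm(data - x, axis=1)
    nbrs = data[np.argsort(dists)[:k]]
    nbrs = nbrs - nbrs.mean(axis=0)
    s = np.linalg.svd(nbrs, compute_uv=False)
    var_ratio = s**2 / np.sum(s**2)
    return int(np.searchsorted(np.cumsum(var_ratio), var_thresh) + 1)

rng = np.random.default_rng(0)

# Points on a 1-D curve embedded in 3-D: locally almost rank-1.
t = rng.uniform(0, 1, 2000)
curve = np.stack([t, np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
lid_curve = lid_local_pca(curve[0], curve)

# Isotropic 3-D Gaussian points: locally full rank.
blob = rng.normal(size=(2000, 3))
lid_blob = lid_local_pca(blob[0], blob)

print(lid_curve, lid_blob)  # low (1-2) vs. 3
```

The same principle drives the paper's estimator: around a point confined to a low-dimensional manifold, the model concentrates variance in few directions, so high density coexists with a small contiguous volume and hence little probability mass.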

Dual Threshold OOD Detection

To address the paradox, the paper proposes a dual-threshold method for OOD detection. A data point is classified as OOD if its likelihood is low, or if its likelihood is high but its LID estimate is small. Conversely, a sample is classified as in-distribution only if both its likelihood and its LID are high.
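As a sketch, the rule reduces to a two-sided test. The scores and thresholds below are hypothetical placeholders chosen for illustration, not values from the paper:

```python
import numpy as np

def classify_ood(log_likelihood, lid, ll_threshold, lid_threshold):
    """Dual-threshold rule: a point is in-distribution only when BOTH
    its log-likelihood and its LID estimate clear their thresholds."""
    in_dist = (log_likelihood >= ll_threshold) & (lid >= lid_threshold)
    return ~in_dist  # True -> flagged as OOD

# Hypothetical per-sample scores. Note sample 1: high likelihood but
# low LID -- exactly the pathological case the paper targets.
ll  = np.array([-900.0, -700.0, -850.0])  # log-likelihoods
lid = np.array([  12.0,    3.0,   14.0])  # LID estimates
flags = classify_ood(ll, lid, ll_threshold=-880.0, lid_threshold=8.0)
print(flags)  # [ True  True False]
```

Sample 0 is rejected for low likelihood, sample 1 for low LID despite its high likelihood, and sample 2 passes both tests and is kept as in-distribution.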

Experiments

Validation of LID-Based Detection

Experiments are conducted on various dataset pairs to evaluate the effectiveness of the proposed method. The results demonstrate significant improvements when LID estimates are used in conjunction with likelihoods, compared to using likelihoods alone. In key pathological scenarios such as FMNIST vs. MNIST, the dual-threshold method yields large AUC-ROC improvements (Figure 2).

Figure 2: FMNIST vs. MNIST: AUC-ROC boost (0.070 → 0.953).
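An AUC-ROC of 0.070 means the likelihood score is not merely uninformative but anti-correlated with the in-distribution label. A small NumPy-only sketch with synthetic scores (not the paper's data) shows how a below-chance AUC arises when OOD samples systematically receive higher likelihoods:

```python
import numpy as np

def auc_roc(scores_in, scores_ood):
    """AUC-ROC via its rank interpretation: the probability that a
    random in-distribution sample scores above a random OOD sample."""
    return float(np.mean(scores_in[:, None] > scores_ood[None, :]))

rng = np.random.default_rng(0)
ll_in  = rng.normal(-900, 30, 1000)  # in-distribution log-likelihoods
ll_ood = rng.normal(-800, 30, 1000)  # OOD gets HIGHER likelihoods

auc = auc_roc(ll_in, ll_ood)
print(auc)  # far below 0.5: likelihood alone fails badly
```

Correcting such a pathological ranking, rather than merely improving a mediocre one, is what drives the 0.070 → 0.953 jump reported for FMNIST vs. MNIST.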

Conclusion

The paper presents a compelling geometric explanation of the likelihood-based OOD detection paradox and provides a practical method to address it by pairing LID estimates with likelihood evaluations. This dual-threshold approach consistently enhances OOD detection performance across diverse datasets and proves more resilient than traditional single-threshold techniques. The study opens avenues for further exploration of the geometry of data manifolds and its impact on density estimation, providing a robust framework for understanding and mitigating likelihood pathologies in DGMs.
