Non-parametric Outlier Synthesis in Machine Learning
The paper "Non-parametric Outlier Synthesis" introduces a framework for out-of-distribution (OOD) detection, a task critical to the safe deployment of machine learning models in the real world. Conventional models often fail to flag OOD samples, instead producing overconfident predictions on data they were never trained on. The proposed approach, Non-Parametric Outlier Synthesis (NPOS), addresses this shortcoming without imposing restrictive assumptions on the distribution of the data.
Core Methodology
The NPOS framework differentiates in-distribution (ID) from OOD data using a non-parametric approach: it does not presume that ID embeddings follow a particular distribution, such as a Gaussian. Artificial OOD samples are synthesized via non-parametric density estimation, using nearest-neighbor distances to identify boundary points in the feature space.
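The boundary-point selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy embeddings, and the specific values of `k` and `num_boundary` are assumptions for the example.

```python
import numpy as np

def knn_distance(embeddings, k):
    """Distance from each ID embedding to its k-th nearest neighbor.
    A larger distance means lower local density, i.e. a point nearer
    the boundary of the ID region."""
    # Brute-force pairwise Euclidean distances (O(n^2); fine for a sketch).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, index 0 is the point itself (distance 0),
    # so the k-th neighbor sits at index k.
    return np.sort(dists, axis=1)[:, k]

def select_boundary_points(embeddings, k, num_boundary):
    """Pick the embeddings with the largest k-NN distances."""
    scores = knn_distance(embeddings, k)
    idx = np.argsort(scores)[-num_boundary:]
    return embeddings[idx]

rng = np.random.default_rng(0)
ids = rng.normal(size=(200, 8))   # toy ID embeddings, not real features
boundary = select_boundary_points(ids, k=10, num_boundary=20)
print(boundary.shape)             # (20, 8)
```

In practice one would use an approximate nearest-neighbor index rather than the quadratic brute-force distance matrix shown here.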
Rejection Sampling for Synthetic OOD Generation
NPOS synthesizes OOD data by rejection sampling around boundary ID embeddings. These boundary embeddings, selected for their large nearest-neighbor distances, seed a Gaussian kernel from which candidate samples are drawn. The rejection criterion retains only candidates with low estimated density, producing a clear delineation between ID and OOD data.
This method allows virtual outliers to be generated dynamically during training, yielding models that can recognize OOD data at deployment. The procedure is computationally efficient, though its effectiveness hinges on the quality of the ID embeddings.
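The sample-then-reject step above can be sketched as shown below. The function name, the kernel width `sigma`, and the use of the candidates' own k-NN distance to the ID set as the density proxy are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def synthesize_outliers(id_embeddings, boundary, k, num_keep, sigma=0.2, rng=None):
    """Draw candidates from Gaussian kernels centred on boundary ID
    embeddings, then keep only the lowest-density candidates
    (largest k-NN distance to the ID set)."""
    if rng is None:
        rng = np.random.default_rng()
    # One Gaussian-perturbed candidate per boundary embedding.
    candidates = boundary + sigma * rng.normal(size=boundary.shape)
    # k-NN distance of each candidate to the ID embedding set; candidates
    # are not members of the set, so the k-th neighbor is at index k-1.
    dists = np.linalg.norm(
        candidates[:, None, :] - id_embeddings[None, :, :], axis=-1)
    knn = np.sort(dists, axis=1)[:, k - 1]
    # Rejection step: retain the num_keep candidates with the largest
    # k-NN distance, i.e. the lowest estimated density.
    keep = np.argsort(knn)[-num_keep:]
    return candidates[keep]

rng = np.random.default_rng(0)
ids = rng.normal(size=(200, 8))
# Stand-in for boundary embeddings selected by their k-NN distances.
boundary = ids[:40]
outliers = synthesize_outliers(ids, boundary, k=10, num_keep=10, rng=rng)
print(outliers.shape)   # (10, 8)
```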
Training Framework
Training in NPOS pursues two objectives jointly: optimizing ID classification on the learned embeddings, and separating synthesized OOD samples from ID data through a purpose-built uncertainty loss. This dual objective yields a compact, well-defined decision boundary and reduces overconfidence on OOD inputs.
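The dual objective can be sketched as a weighted sum of a standard cross-entropy term and a binary logistic term that pushes an uncertainty score apart for ID versus synthetic-OOD samples. The score form, the function names, and the weight `lam` are assumptions for illustration; the paper's actual loss may differ in detail.

```python
import numpy as np

def softmax_ce(logits, labels):
    """Standard cross-entropy for the ID classification objective."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def uncertainty_loss(id_scores, ood_scores):
    """Binary logistic loss that pushes scores for ID samples high and
    scores for synthetic OOD samples low (illustrative form)."""
    sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))
    return -(np.log(sigmoid(id_scores)).mean()
             + np.log(1.0 - sigmoid(ood_scores)).mean())

def dual_objective(logits, labels, id_scores, ood_scores, lam=0.1):
    """Joint loss: ID classification plus weighted OOD uncertainty term."""
    return softmax_ce(logits, labels) + lam * uncertainty_loss(id_scores, ood_scores)

# Toy example: two well-classified ID samples, well-separated scores.
logits = np.array([[2.0, 0.0], [0.0, 2.0]])
labels = np.array([0, 1])
loss = dual_objective(logits, labels,
                      id_scores=np.array([3.0]), ood_scores=np.array([-3.0]))
print(loss > 0)   # True
```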
Empirical Validation
Comprehensive experiments across datasets of varying scale, including CIFAR-10, CIFAR-100, and ImageNet-100, demonstrate NPOS's strong OOD detection performance. Notably, when fine-tuning pre-trained models such as CLIP, NPOS consistently lowered the false positive rate, improving markedly over methods such as VOS that rely on parametric (Gaussian) assumptions.
Implications and Future Directions
The non-parametric nature of this framework allows it to generalize effectively across different datasets and model architectures, offering a significant leap in the flexibility and applicability of OOD detection systems. The ability to operate without distributional assumptions provides robustness in complex, real-world settings where data distributions can be highly variable.
Looking forward, the simplicity and efficacy of the NPOS method suggest several avenues for further exploration:
- Extension to Multi-modal Data: Expanding the framework to incorporate multi-modal inputs such as video and text may enhance the robustness of OOD detection in more diverse applications.
- Integration with Other Learning Paradigms: Combining NPOS with semi-supervised and unsupervised learning techniques could further enhance its utility and performance.
- Scalability to Larger Architectures: Investigating its integration with larger and more advanced model architectures may offer benefits in even more challenging OOD scenarios.
Conclusion
NPOS presents a compelling advancement in the pursuit of reliable and safe deployment of machine learning models beyond the confines of their training data. By eschewing restrictive distributional assumptions, it establishes a new benchmark for flexibility and accuracy in OOD detection, paving the way for future innovation and adoption in various domains.