SSD: A Unified Framework for Self-Supervised Outlier Detection (2103.12051v1)

Published 22 Mar 2021 in cs.CV, cs.AI, and cs.LG

Abstract: We ask the following question: what training information is required to design an effective outlier/out-of-distribution (OOD) detector, i.e., detecting samples that lie far away from the training distribution? Since unlabeled data is easily accessible for many applications, the most compelling approach is to develop detectors based on only unlabeled in-distribution data. However, we observe that most existing detectors based on unlabeled data perform poorly, often equivalent to a random prediction. In contrast, existing state-of-the-art OOD detectors achieve impressive performance but require access to fine-grained data labels for supervised training. We propose SSD, an outlier detector based on only unlabeled in-distribution data. We use self-supervised representation learning followed by a Mahalanobis distance based detection in the feature space. We demonstrate that SSD outperforms most existing detectors based on unlabeled data by a large margin. Additionally, SSD even achieves performance on par, and sometimes even better, with supervised training based detectors. Finally, we expand our detection framework with two key extensions. First, we formulate few-shot OOD detection, in which the detector has access to only one to five samples from each class of the targeted OOD dataset. Second, we extend our framework to incorporate training data labels, if available. We find that our novel detection framework based on SSD displays enhanced performance with these extensions, and achieves state-of-the-art performance. Our code is publicly available at https://github.com/inspire-group/SSD.

Authors (3)
  1. Vikash Sehwag (33 papers)
  2. Mung Chiang (65 papers)
  3. Prateek Mittal (129 papers)
Citations (295)

Summary

  • The paper introduces SSD, a self-supervised framework that detects out-of-distribution instances using unlabeled in-distribution data.
  • It leverages contrastive learning and a Mahalanobis distance metric to deliver superior AUROC performance on benchmarks like CIFAR-10 and CIFAR-100.
  • The few-shot extension, SSDₖ, improves detection with minimal OOD samples, paving the way for practical deployment in real-world scenarios.

An Expert Overview of "SSD: A Unified Framework for Self-Supervised Outlier Detection"

The paper presents a novel outlier detection framework named SSD, designed to effectively identify out-of-distribution (OOD) instances using only unlabeled in-distribution data. This work challenges the prevalent assumption that labels are essential for high-performance OOD detection and proposes a self-supervised learning paradigm instead.

Background and Motivation

Deep neural networks are increasingly deployed in safety-critical tasks but often fail when exposed to OOD data. Conventional OOD detectors that perform well on complex inputs, particularly image datasets, typically rely on labeled training data, a requirement that can be prohibitive given the labor-intensive process of annotation. In light of this, the authors propose SSD, a self-supervised OOD detector that does not rely on labels at all.

Methodological Contributions

SSD employs self-supervised representation learning followed by a Mahalanobis distance-based detection mechanism in the feature space. Key to this approach is leveraging advancements in self-supervised learning, specifically contrastive learning techniques, to derive useful feature representations for OOD detection. This eliminates the need for fine-grained labels, making the framework practically attractive.
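
The detection step is straightforward to sketch. Below is a minimal NumPy illustration, assuming features have already been extracted by a contrastive encoder; for simplicity it fits a single Gaussian, whereas the paper partitions the features into clusters and takes the minimum distance over cluster centers. Function names here are illustrative, not taken from the authors' code.

```python
import numpy as np

def fit_gaussian(train_feats):
    """Fit the mean and regularized inverse covariance of in-distribution features."""
    mu = train_feats.mean(axis=0)
    centered = train_feats - mu
    cov = centered.T @ centered / len(train_feats)
    cov += 1e-6 * np.eye(cov.shape[0])  # regularize so the inverse exists
    return mu, np.linalg.inv(cov)

def mahalanobis_score(feats, mu, cov_inv):
    """Squared Mahalanobis distance to the training distribution;
    a larger score indicates a more likely outlier."""
    d = feats - mu
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)
```

A threshold on this score, chosen on held-out in-distribution data (e.g., at the 95th percentile), then yields the binary in/out decision.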

In contrast to prior unsupervised approaches such as autoencoders and density models, which have failed to scale across data modalities, SSD achieves superior performance. The authors further enrich the methodology by introducing few-shot OOD detection (SSDₖ), in which the detector leverages a handful of OOD samples, one to five per class, to improve performance without necessitating a large labeled dataset.
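
A hedged sketch of how the few-shot variant can reuse the machinery above: with only a few OOD samples one can estimate their mean but not a reliable covariance, so the in-distribution covariance is shared. This follows the spirit of SSDₖ as described in the paper; the exact scoring rule in the authors' implementation may differ.

```python
def few_shot_score(feats, mu_in, cov_inv, ood_feats):
    """Compare each test feature's Mahalanobis distance to the in-distribution
    mean against its distance to the mean of the k known OOD samples
    (covariance is shared, since k is too small to fit one)."""
    mu_ood = ood_feats.mean(axis=0)
    d_in, d_ood = feats - mu_in, feats - mu_ood
    s_in = np.einsum("ij,jk,ik->i", d_in, cov_inv, d_in)
    s_ood = np.einsum("ij,jk,ik->i", d_ood, cov_inv, d_ood)
    return s_in - s_ood  # larger => closer to the known OOD samples
```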

Results and Implications

Quantitatively, SSD demonstrates remarkable improvements in unsupervised OOD detection, showcasing AUROC scores significantly higher than existing unsupervised methods. It also performs competitively with supervised techniques, occasionally surpassing them. The framework is tested on benchmark datasets like CIFAR-10 and CIFAR-100, where it demonstrates strong detection capabilities.
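
For context, AUROC measures how well the scalar outlier score ranks OOD inputs above in-distribution ones: 0.5 is chance, 1.0 is perfect separation. An illustrative computation with synthetic scores (not the paper's numbers):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
scores_in = rng.normal(0.0, 1.0, size=1000)   # in-distribution scores
scores_ood = rng.normal(2.0, 1.0, size=1000)  # OOD scores, shifted higher

labels = np.concatenate([np.zeros(1000), np.ones(1000)])  # 1 = OOD
scores = np.concatenate([scores_in, scores_ood])
print(f"AUROC: {roc_auc_score(labels, scores):.3f}")
```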

SSD's few-shot extension reveals that even minimal OOD data can enhance detection performance, providing a robust approach in situations where such samples are scarce. This adaptability of the framework potentially broadens its applicability across various domains.

Future Prospects

The theoretical implications of this research suggest a shift in OOD detection paradigms: moving away from reliance on labeled datasets toward exploiting rich, unlabeled data. Practically, this work opens avenues for deploying OOD detectors in real-world scenarios where labeling is impractical. The integration of few-shot learning mechanisms could further enhance deployment adaptability.

Moreover, the authors highlight the potential for SSD to seamlessly assimilate label information when available, combining self-supervised and supervised training methodologies. This could drive future explorations into hybrid models that optimize resource use while ensuring robust OOD detection.
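
One natural way to realize this hybrid is to swap the self-supervised contrastive objective for a supervised contrastive (SupCon-style) loss when labels are available, so that all same-class samples act as positives. The PyTorch sketch below is illustrative, not the authors' code, and assumes L2-normalizable embeddings.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.5):
    """Supervised contrastive loss: embeddings of the same class are pulled
    together, all others pushed apart. features: (N, D), labels: (N,)."""
    feats = F.normalize(features, dim=1)
    sim = feats @ feats.T / temperature                 # pairwise similarities
    self_mask = torch.eye(len(feats), dtype=torch.bool, device=feats.device)
    sim = sim.masked_fill(self_mask, float("-inf"))     # drop self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # mean log-probability over each anchor's positives, averaged over anchors
    per_anchor = log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return -per_anchor.mean()
```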

Conclusion

In summary, SSD marks a significant advance in OOD detection by eschewing the traditional dependence on labeled data. Its ability to maintain high performance with minimal supervision is a promising development for both academia and industry. Future work might explore optimized self-supervised models that further capitalize on the latent potential of unlabeled data.
