A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges (2110.14051v5)

Published 26 Oct 2021 in cs.CV and cs.LG

Abstract: Machine learning models often encounter samples that diverge from the training distribution. Failure to recognize an out-of-distribution (OOD) sample, and consequently assigning that sample to an in-class label, significantly compromises the reliability of a model. The problem has gained significant attention due to its importance for safely deploying models in open-world settings. Detecting OOD samples is challenging due to the intractability of modeling all possible unknown distributions. To date, several research domains tackle the problem of detecting unfamiliar samples, including anomaly detection, novelty detection, one-class learning, open-set recognition, and out-of-distribution detection. Despite having similar and shared concepts, out-of-distribution, open-set, and anomaly detection have been investigated independently. Accordingly, these research avenues have not cross-pollinated, creating research barriers. While some surveys intend to provide an overview of these approaches, they tend to focus on a specific domain without examining the relationships between different domains. This survey aims to provide a cross-domain and comprehensive review of numerous eminent works in the respective areas while identifying their commonalities. Researchers can benefit from the overview of research advances in different fields and develop future methodologies synergistically. Furthermore, to the best of our knowledge, while there are surveys on anomaly detection or one-class learning, there is no comprehensive or up-to-date survey on out-of-distribution detection, which our survey covers extensively. Finally, having a unified cross-domain perspective, we discuss and shed light on future lines of research, intending to bring these fields closer together.

Authors (6)
  1. Mohammadreza Salehi (26 papers)
  2. Hossein Mirzaei (11 papers)
  3. Dan Hendrycks (63 papers)
  4. Yixuan Li (183 papers)
  5. Mohammad Hossein Rohban (43 papers)
  6. Mohammad Sabokrou (53 papers)
Citations (172)

Summary

A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges

The academic paper titled "A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges," authored by Mohammadreza Salehi and colleagues, presents an extensive overview of the interconnected yet distinct research domains concerned with samples that diverge from the training distribution. The paper seeks to bridge research gaps by synthesizing advances across anomaly detection, novelty detection, one-class learning, open-set recognition, and out-of-distribution detection, promoting interdisciplinary synergy and collective progress in these fields.

Machine learning models typically operate under a closed-set assumption, where test data originate from the same distribution as the training data. In real-world applications, however, models often encounter diverse test inputs, including ones that fall outside the training distribution. This raises significant concerns about the reliability and safety of deployed models, particularly in open-world settings. The survey addresses the fragmented research on identifying such unknowns across domains that share similar goals yet have been tackled predominantly in isolation.

The paper highlights several noteworthy approaches and methodologies. Open-set recognition systems, for example, train on a subset of known classes and strive to correctly classify seen classes while rejecting unseen ones at test time, embodying one of the core methodologies explored in the survey. Similarly, out-of-distribution detection focuses on distinguishing in-distribution classes from OOD inputs using techniques such as MSP (maximum softmax probability), ODIN, and the Mahalanobis distance, among others.
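To make the scoring idea concrete, the following is a minimal sketch of the MSP baseline. The `model` variable is assumed to be any trained PyTorch classifier, and the 0.5 threshold is illustrative rather than a value taken from the paper.

```python
# A minimal sketch of the MSP (maximum softmax probability) baseline for
# OOD detection. `model` is assumed to be a trained PyTorch classifier;
# the threshold would normally be tuned on held-out validation data.
import torch
import torch.nn.functional as F

def msp_score(model, x):
    """Maximum softmax probability per input; higher suggests in-distribution."""
    model.eval()
    with torch.no_grad():
        logits = model(x)                  # shape: (batch, num_classes)
        probs = F.softmax(logits, dim=1)
    return probs.max(dim=1).values

def flag_ood(model, x, threshold=0.5):
    """Flag inputs whose MSP score falls below the threshold as OOD."""
    return msp_score(model, x) < threshold
```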

One distinctive merit of the survey is its cross-domain examination, identifying commonalities and differences. It acknowledges the benefits of self-supervised learning tasks as well as the role of generative models that reconstruct their inputs, both of which underpin approaches across these domains. The use of autoencoders for anomaly detection and Gaussian descriptors for modeling seen classes illustrates the survey's coverage of both generative and discriminative models.
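As a hedged illustration of the reconstruction-based family, the sketch below scores samples by autoencoder reconstruction error. The 28x28 grayscale input size, layer widths, and latent dimension are assumptions made for illustration, not specifics from the survey.

```python
# A reconstruction-error anomaly scorer built on a small autoencoder.
# Samples the model reconstructs poorly receive high anomaly scores.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 28 * 28), nn.Sigmoid(),
        )

    def forward(self, x):
        # Encode, decode, and reshape back to the input's shape.
        return self.decoder(self.encoder(x)).view_as(x)

def anomaly_score(model, x):
    """Per-sample mean squared reconstruction error; higher means more anomalous."""
    model.eval()
    with torch.no_grad():
        recon = model(x)
        return ((recon - x) ** 2).flatten(1).mean(dim=1)
```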

The authors outline plausible future research directions, discussing the need for better-defined self-supervised learning tasks, improved data augmentation strategies, and remedies for the limitations of existing evaluation protocols. Additionally, they call for deeper exploration of challenges such as adversarial robustness, fairness, and explainability, particularly when models are deployed in real-life applications where high false positive rates can have critical implications.

The survey extensively covers the importance of datasets as benchmarks for evaluating detection methods, ranging from semantic-level datasets such as MNIST and CIFAR-10 to more domain-specific resources such as MVTec AD for industrial anomaly detection. Furthermore, it emphasizes evaluating models under diverse, realistic distribution shifts to better assess the practical utility of OOD detection methods.
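A common evaluation protocol in this literature reports AUROC over pooled in-distribution and OOD scores. The short sketch below shows that computation with synthetic Gaussian scores standing in for a real detector's outputs; the distributions and dataset mentions in the comments are purely illustrative.

```python
# AUROC for the standard OOD evaluation protocol: scores from an
# in-distribution test set and an OOD set are pooled and ranked.
import numpy as np
from sklearn.metrics import roc_auc_score

def ood_auroc(scores_in, scores_out):
    """AUROC where higher scores should indicate in-distribution samples."""
    labels = np.concatenate([np.ones_like(scores_in), np.zeros_like(scores_out)])
    scores = np.concatenate([scores_in, scores_out])
    return roc_auc_score(labels, scores)

rng = np.random.default_rng(0)
in_scores = rng.normal(loc=0.8, scale=0.1, size=1000)   # e.g. detector scores on CIFAR-10 test
out_scores = rng.normal(loc=0.5, scale=0.2, size=1000)  # e.g. detector scores on an OOD set
print(f"AUROC: {ood_auroc(in_scores, out_scores):.3f}")
```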

Overall, this survey provides a unified cross-domain perspective vital for advancing the safety, reliability, and robustness of machine learning models facing unfamiliar inputs. By addressing the barriers and challenges across anomaly detection, novelty detection, open-set recognition, and OOD detection in a unified narrative, it sets the stage for further interdisciplinary collaboration and for methodologies that can adapt to the complexities of open-world recognition systems.