- The paper provides a unified survey covering anomaly, novelty, open-set, and out-of-distribution detection, aiming to bridge research gaps and promote interdisciplinary synergy in handling unknown inputs.
- It highlights methodologies like open-set recognition systems, OOD detection techniques (MSP, ODIN), and the role of self-supervised learning and generative models in addressing these challenges.
- The survey identifies future research needs including improved self-supervised learning, data augmentation, better evaluation protocols, and addressing adversarial robustness, fairness, and explainability.
A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges
The paper "A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges," by Mohammadreza Salehi and colleagues, gives an extensive overview of several interconnected yet distinct research areas, all concerned with how machine learning models handle samples that diverge from the training distribution. It synthesizes advances across anomaly detection, novelty detection, one-class learning, open-set recognition, and out-of-distribution detection, with the aim of bridging research gaps and encouraging these fields to build on one another's progress.
Machine learning models typically operate under a closed-set assumption: test data is expected to come from the same distribution as the training data. In real-world applications, however, models often encounter diverse test inputs, including inputs from outside the training distribution. This raises significant concerns about the reliability and safety of deployed models, particularly in open-world settings. The survey addresses the fragmentation of research on identifying unknowns: the domains involved are closely related, yet they have been studied largely in isolation.
The paper highlights several noteworthy approaches. Open-set recognition (OSR) systems, for example, train on a subset of known classes and, at test time, must classify samples from those classes correctly while rejecting samples from unseen classes. Similarly, out-of-distribution (OOD) detection focuses on separating in-distribution inputs from OOD inputs, using techniques such as maximum softmax probability (MSP), ODIN, and Mahalanobis distance, among others.
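The MSP idea and the open-set rejection step described above can be sketched as follows. This is a minimal illustration rather than the survey's own code: the function names, the example logits, and the threshold of 0.5 are assumptions chosen for demonstration, and in practice the threshold would be tuned on held-out data.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def msp_score(logits):
    # Maximum softmax probability: higher means "looks more in-distribution".
    return softmax(logits).max(axis=1)

def predict_with_rejection(logits, threshold=0.5):
    # Hypothetical helper: return the predicted class, or -1 ("unknown")
    # when the MSP score falls below the (assumed) threshold.
    probs = softmax(logits)
    preds = probs.argmax(axis=1)
    preds[probs.max(axis=1) < threshold] = -1
    return preds

# Illustrative logits: one confident prediction, one nearly uniform one.
logits = np.array([[8.0, 0.5, 0.1],
                   [1.0, 1.1, 0.9]])
print(msp_score(logits))
print(predict_with_rejection(logits))
```

The first row is assigned to class 0 with high confidence, while the second row has an almost flat softmax output and is rejected as unknown.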
One distinctive merit of the survey is its cross-domain examination, which identifies commonalities and differences among the fields. It acknowledges the benefits of self-supervised learning tasks and the role of generative models in reconstructing inputs, both of which bear on challenges shared across these domains. Its discussion of autoencoders for anomaly detection and of Gaussian descriptors for modeling seen classes illustrates the breadth of its coverage of generative and discriminative models.
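As a rough illustration of modeling seen data with a Gaussian descriptor, the sketch below fits a single Gaussian to feature vectors and scores new inputs by squared Mahalanobis distance. The synthetic features, the function names, and the small ridge term added to the covariance are assumptions made for this example; a real pipeline would use features from a trained network, often with one Gaussian per class.

```python
import numpy as np

def fit_gaussian(features):
    # Estimate the mean and the inverse covariance (precision) of seen features.
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    cov += 1e-6 * np.eye(features.shape[1])  # assumed ridge term for stability
    return mean, np.linalg.inv(cov)

def mahalanobis_score(x, mean, precision):
    # Squared Mahalanobis distance per row; larger means "more anomalous".
    diff = x - mean
    return np.einsum("ij,jk,ik->i", diff, precision, diff)

# Synthetic stand-in for features of seen (in-distribution) samples.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 4))
mean, precision = fit_gaussian(train)

near = np.zeros((1, 4))      # close to the training mean
far = np.full((1, 4), 6.0)   # far from the training data
print(mahalanobis_score(near, mean, precision))
print(mahalanobis_score(far, mean, precision))
```

The point far from the training data receives a much larger score than the point near the mean, so thresholding the score yields a simple detector.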
The authors outline plausible future research directions, including better-defined self-supervised learning tasks, improved data augmentation strategies, and evaluation protocols that overcome the limitations of existing ones. They also call for deeper exploration of adversarial robustness, fairness, and explainability, particularly for real-world applications in which high false positive rates can have critical consequences.
The survey also covers in detail the datasets used as benchmarks for evaluating detection methods, ranging from semantic-level datasets such as MNIST and CIFAR-10 to more domain-specific resources such as MVTec AD for industrial anomaly detection. It further emphasizes evaluating models under diverse, realistic distribution shifts in order to better assess the practical utility of OOD detection methods.
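To make the evaluation step concrete, the sketch below computes AUROC, one commonly used metric for detection methods, directly from scores. This summary does not state which metrics the survey relies on, so the choice of AUROC, the function name, and the example scores are assumptions for illustration only.

```python
import numpy as np

def auroc(scores_in, scores_out):
    # Probability that a random in-distribution sample scores higher than
    # a random OOD sample; ties count as half (Mann-Whitney formulation).
    scores_in = np.asarray(scores_in, dtype=float)
    scores_out = np.asarray(scores_out, dtype=float)
    diff = scores_in[:, None] - scores_out[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Illustrative confidence scores (higher = more in-distribution).
in_scores = [0.9, 0.8, 0.7]
ood_scores = [0.6, 0.75, 0.2]
print(auroc(in_scores, ood_scores))
```

A value of 1.0 would mean the detector ranks every in-distribution sample above every OOD sample, and 0.5 corresponds to chance. The pairwise comparison uses memory proportional to the product of the two sample counts, so a rank-based implementation would be preferable for large test sets.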
Overall, the survey provides a unified cross-domain perspective that matters for the safety, reliability, and robustness of machine learning models facing unfamiliar inputs. By treating the barriers and challenges of anomaly detection, novelty detection, open-set recognition, and out-of-distribution detection in a single narrative, it sets the stage for further interdisciplinary collaboration and for new methods that can cope with the complexities of open-world recognition.