Generalized Out-of-Distribution Detection: A Survey (2110.11334v3)

Published 21 Oct 2021 in cs.CV, cs.AI, and cs.LG

Abstract: Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over the control to humans when it detects unusual scenes or objects that it has never seen during training time and cannot make a safe decision. The term, OOD detection, first emerged in 2017 and since then has received increasing attention from the research community, leading to a plethora of methods developed, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD), are closely related to OOD detection in terms of motivation and methodology. Despite common goals, these topics develop in isolation, and their subtle differences in definition and problem setting often confuse readers and practitioners. In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. We then review each of these five areas by summarizing their recent technical developments, with a special focus on OOD detection methodologies. We conclude this survey with open challenges and potential research directions.

Summary

The paper introduces a unified framework, generalized OOD detection, that covers anomaly detection, novelty detection, open set recognition, OOD detection, and outlier detection, and clarifies how they relate.
The paper presents a comprehensive review of various OOD methods including classification-, density-, distance-, and reconstruction-based approaches.
The paper identifies key future research directions such as robust evaluation benchmarks and integration with tasks like zero-shot learning to enhance model reliability.

Generalized Out-of-Distribution Detection: A Survey

The paper "Generalized Out-of-Distribution Detection: A Survey" provides an exhaustive analysis of out-of-distribution (OOD) detection, a critical component in ensuring the reliability and safety of machine learning systems. The authors present a unified framework that encompasses five related problems: anomaly detection (AD), novelty detection (ND), open set recognition (OSR), out-of-distribution detection, and outlier detection (OD). This approach seeks to clarify the similarities and differences between these sub-topics, providing a coherent structure for understanding their relationships.

The importance of OOD detection is underscored by its application in various safety-critical areas, such as autonomous driving and trustworthy visual recognition systems. The detection of unknown instances is essential for handing control to human operators when machine learning systems encounter unrecognized scenarios, contributing to both reliability and safety.

Key Contributions

Unified Framework: The paper introduces a generalized OOD detection framework that integrates the five problems mentioned above, offering clarity and helping researchers position their work correctly within the landscape. It breaks these problems into specific sub-tasks based on distribution shift, data types, and learning approaches (inductive vs. transductive), enabling a structured comparison across tasks.
Comprehensive Survey for OOD Detection: The paper reviews existing OOD detection methods, grouped into classification-based, density-based, distance-based, and reconstruction-based approaches, and connects methodologies across the different sub-tasks.
Future Research Directions: By identifying open challenges and future opportunities, the paper encourages continued exploration in areas such as covariate shift detection, integration with domain adaptation (DA) and domain generalization (DG), and more robust evaluation on larger datasets like ImageNet.

Methodological Insights

The paper categorizes OOD detection methodologies into four families:

Classification-based Methods: These focus on deriving improved OOD scores utilizing softmax confidence calibration, outlier exposure, and gradient information. Some methods extend the label space to improve semantic representation by leveraging hierarchical taxonomies or word embeddings.
Density-based Methods: These model the in-distribution (ID) data with probabilistic models and flag low-likelihood inputs as OOD. A known difficulty is the paradoxical case in which OOD instances receive likelihoods as high as, or higher than, those of ID data.
Distance-based Methods: These rely on the distance between a test sample and class centroids or prototypes in feature space, using measures such as Mahalanobis distance and cosine similarity.
Reconstruction-based Methods: These rest on the assumption that autoencoder models trained on ID data will reconstruct OOD samples poorly, so a high reconstruction error signals an OOD input.
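
To make two of these families concrete, the following is a minimal NumPy sketch, not the survey's reference implementation. It assumes a classifier that exposes logits and penultimate-layer features. The function names (msp_score, fit_class_gaussians, mahalanobis_score) and the toy 2-D data are hypothetical, chosen only for illustration. The classification-based score is the maximum softmax probability. The distance-based score is the negative squared Mahalanobis distance to the nearest class mean under a shared covariance.

```python
import numpy as np

def msp_score(logits):
    """Classification-based score: maximum softmax probability (higher = more ID-like)."""
    z = logits - logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

def fit_class_gaussians(features, labels):
    """Estimate per-class means and one shared covariance; return means and its inverse."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    cov += 1e-6 * np.eye(features.shape[1])  # small ridge so the inverse exists
    return means, np.linalg.inv(cov)

def mahalanobis_score(x, means, precision):
    """Distance-based score: negative squared Mahalanobis distance to the nearest class mean."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n_samples, n_classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return -d2.min(axis=1)

# Toy data (an assumption for illustration): two ID classes as 2-D Gaussian blobs.
rng = np.random.default_rng(0)
train_x = np.concatenate([rng.normal(-3, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
train_y = np.repeat([0, 1], 200)
means, precision = fit_class_gaussians(train_x, train_y)

id_x = rng.normal(3, 1, (5, 2))                         # drawn near class 1
ood_x = rng.normal(0, 1, (5, 2)) + np.array([0.0, 12])  # far from both classes
id_scores = mahalanobis_score(id_x, means, precision)
ood_scores = mahalanobis_score(ood_x, means, precision)
```

Both functions follow the same convention: a higher score means the input looks more in-distribution, so a single threshold on the score yields the ID/OOD decision. In practice the features would come from a trained network rather than raw coordinates.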

Experimental and Practical Implications

The survey stresses the necessity of proper evaluation metrics and benchmarks, emphasizing the importance of constructing tests that explicitly distinguish between ID and OOD data. It underscores that while MVTec-AD is a popular benchmark, real-world applications need rigorously curated datasets to prevent category overlap.
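
As one illustration of a threshold-free metric, the sketch below computes AUROC, which is widely used for OOD detection. The summary above does not name specific metrics, so this choice is an assumption. The helper name auroc is hypothetical, and the pairwise comparison is written for clarity rather than speed. It treats ID as the positive class and assumes higher scores mean "more in-distribution".

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample; ties count half."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Compare every ID score against every OOD score (O(n*m) memory; fine for a sketch).
    greater = (id_scores[:, None] > ood_scores[None, :]).sum()
    ties = (id_scores[:, None] == ood_scores[None, :]).sum()
    return (greater + 0.5 * ties) / (id_scores.size * ood_scores.size)
```

A value of 1.0 means the score separates ID from OOD perfectly, and 0.5 corresponds to chance. Because the metric depends on which samples are labeled ID and which OOD, any category overlap between the two test sets directly distorts it.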

The paper also highlights the trade-off between maintaining classification performance and enhancing OOD detection. Effective approaches should balance these requirements, ensuring robust performance across both dimensions.

Challenges and Directions

Key challenges include outlier-free methodologies, more sophisticated evaluations for real-world benchmarks, and integrating OOD detection with broader learning tasks like zero-shot learning and object detection. Collaboration between different methodologies is encouraged to enrich the solution space and facilitate technology transfer among related tasks.

Conclusion

This survey serves as a reference point for OOD detection research, providing clarity and direction in a complex field. By defining a generalized framework and reviewing each sub-task in depth, it lays a foundation for future research, new methods, and practical applications. Academic and industrial practitioners alike can draw on its insights and proposed directions to improve the reliability and scope of machine learning systems in changing environments.
