Toward a Realistic Benchmark for Out-of-Distribution Detection (2404.10474v1)
Abstract: Deep neural networks are increasingly used in a wide range of technologies and services, but remain highly susceptible to out-of-distribution (OOD) samples, that is, samples drawn from a different distribution than the training set. A common approach to address this issue is to endow deep neural networks with the ability to detect OOD samples. Several benchmarks have been proposed to design and validate OOD detection techniques. However, many of them are based on far-OOD samples drawn from very different distributions, and thus lack the complexity needed to capture the nuances of real-world scenarios. In this work, we introduce a comprehensive benchmark for OOD detection, based on ImageNet and Places365, that labels individual classes as in-distribution or out-of-distribution depending on their semantic similarity to the training set. Several techniques can be used to determine which classes should be considered in-distribution, yielding benchmarks with varying properties. Experimental results on different OOD detection techniques show how their measured efficacy depends on the selected benchmark and how confidence-based techniques may outperform classifier-based ones on near-OOD samples.
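As a rough illustration of the class-assignment idea described in the abstract, the sketch below splits candidate classes into in-distribution and out-of-distribution sets by thresholding a semantic similarity score. The abstract does not state which similarity measure or threshold the benchmark uses, so the function name `split_by_similarity`, the precomputed score table, the class names, and the threshold of 0.5 are all hypothetical choices made for illustration only.

```python
from typing import Dict, List, Tuple


def split_by_similarity(
    similarity: Dict[str, Dict[str, float]],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Split candidate classes into in-distribution and OOD lists.

    similarity[candidate][train_class] is a hypothetical, precomputed
    semantic similarity score in [0, 1]. A candidate counts as
    in-distribution when its best match among the training classes
    reaches the threshold; otherwise it is treated as OOD.
    """
    in_dist: List[str] = []
    out_dist: List[str] = []
    for candidate, scores in similarity.items():
        best = max(scores.values())  # closest training class
        if best >= threshold:
            in_dist.append(candidate)
        else:
            out_dist.append(candidate)
    return in_dist, out_dist


# Toy scores, invented for illustration (not taken from the paper).
toy = {
    "kitchen": {"stove": 0.82, "tabby cat": 0.10},
    "pet shop": {"stove": 0.15, "tabby cat": 0.64},
    "glacier": {"stove": 0.05, "tabby cat": 0.08},
}
in_dist, out_dist = split_by_similarity(toy, threshold=0.5)
```

Raising or lowering the threshold moves classes between the two sets, which is one way a single pair of datasets could yield benchmarks with different near-OOD and far-OOD proportions.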
- Pietro Recalcati
- Fabio Garcea
- Luca Piano
- Fabrizio Lamberti
- Lia Morra