Understanding Likelihood of Normalizing Flow and Image Complexity through the Lens of Out-of-Distribution Detection (2402.10477v1)
Abstract: Out-of-distribution (OOD) detection is crucial to safety-critical machine learning applications and has been extensively studied. While recent studies have predominantly focused on classifier-based methods, research on deep generative model (DGM)-based methods has lagged behind. This disparity may be attributed to a perplexing phenomenon: DGMs often assign higher likelihoods to unknown OOD inputs than to the data they were trained on. This paper focuses on explaining the underlying mechanism of this phenomenon. We propose the hypothesis that less complex images concentrate in high-density regions of the latent space, which leads a Normalizing Flow (NF) to assign them higher likelihoods. We experimentally validate this hypothesis on five NF architectures, concluding that their likelihoods are untrustworthy for OOD detection. We further show that this problem can be alleviated by treating image complexity as an independent variable. Finally, we provide evidence that our hypothesis may extend to another DGM, PixelCNN++.
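The complexity-compensation idea can be made concrete with a small scoring function. Below is a minimal, hypothetical sketch (not the authors' released code): `nf_log_prob` is an assumed placeholder for a trained normalizing flow's log-density, and the bit length of a lossless zlib compression stands in for image complexity, in the spirit of the input-complexity compensation scheme of Serrà et al. (cited below) and the paper's suggestion to treat complexity as a separate variable.

```python
# Hypothetical sketch: complexity-compensated OOD scoring for a
# normalizing flow. `nf_log_prob` is an assumed placeholder for a
# trained flow's log-density function, returning log p(x) in nats.
import zlib
import numpy as np

def image_complexity_bits(img_uint8: np.ndarray) -> float:
    """Proxy for image complexity: bit length of a lossless (zlib)
    compression of the raw pixel bytes. A real implementation might
    use a stronger codec such as PNG or FLAC."""
    return 8.0 * len(zlib.compress(img_uint8.tobytes(), 9))

def complexity_adjusted_score(img_uint8: np.ndarray, nf_log_prob) -> float:
    """OOD score = NLL in bits minus estimated complexity in bits.
    Simple, highly compressible images receive inflated NF likelihoods,
    so subtracting their code length offsets that bias; higher scores
    indicate inputs more likely to be OOD."""
    nll_bits = -nf_log_prob(img_uint8) / np.log(2.0)  # nats -> bits
    return nll_bits - image_complexity_bits(img_uint8)
```

Under this scheme, ranking inputs by the adjusted score rather than by raw likelihood removes the component of the likelihood that merely reflects low image complexity.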
- Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 427–436, 2015.
- VOS: Learning what you don’t know by virtual outlier synthesis. In Proceedings of the International Conference on Learning Representations, 2022.
- Efficient out-of-distribution detection in digital pathology using multi-head convolutional neural networks. In Tal Arbel, Ismail Ben Ayed, Marleen de Bruijne, Maxime Descoteaux, Herve Lombaert, and Christopher Pal, editors, Proceedings of the Third Conference on Medical Imaging with Deep Learning, volume 121 of Proceedings of Machine Learning Research, pages 465–478. PMLR, 06–08 Jul 2020.
- OpenOOD: Benchmarking generalized out-of-distribution detection. In Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022.
- Generalized out-of-distribution detection: A survey. CoRR, abs/2110.11334, 2021.
- Provable guarantees for understanding out-of-distribution detection. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2022.
- Simple and principled uncertainty estimation with deterministic deep learning via distance awareness. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 7498–7512. Curran Associates, Inc., 2020.
- MOS: Towards scaling out-of-distribution detection for large semantic space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8710–8719, June 2021.
- Do deep generative models know what they don’t know? In International Conference on Learning Representations, 2019.
- WAIC, but why? Generative ensembles for robust anomaly detection, 2018.
- Input complexity and out-of-distribution detection with likelihood-based generative models. In International Conference on Learning Representations, 2020.
- Detecting out-of-distribution inputs to deep generative models using typicality, 2020.
- A. Hyvärinen and E. Oja. Independent component analysis: Algorithms and applications. Neural Netw., 13(4–5):411–430, May 2000.
- Claude E Shannon. A mathematical theory of communication. The Bell system technical journal, 27(3):379–423, 1948.
- Closing the dequantization gap: PixelCNN as a single-layer flow. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 3724–3734. Curran Associates, Inc., 2020.
- Likelihood ratios for out-of-distribution detection. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
- Understanding anomaly detection with deep invertible networks through hierarchies of distributions and features. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 21038–21049. Curran Associates, Inc., 2020.
- Why normalizing flows fail to detect out-of-distribution data. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 20578–20589. Curran Associates, Inc., 2020.
- Further analysis of outlier detection with deep generative models. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 8982–8992. Curran Associates, Inc., 2020.
- Understanding failures in out-of-distribution detection with deep generative models. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 12427–12436. PMLR, 18–24 Jul 2021.
- A note on the evaluation of generative models. In International Conference on Learning Representations, Apr 2016.
- Modelling image complexity by independent component analysis, with application to content-based image retrieval. In Cesare Alippi, Marios Polycarpou, Christos Panayiotou, and Georgios Ellinas, editors, Artificial Neural Networks – ICANN 2009, pages 704–714, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
- NICE: Non-linear independent components estimation. In International Conference on Learning Representations, 2015.
- Pattern recognition and machine learning. Springer, 2006.
- An information-theoretic framework for image complexity. In Proceedings of the First Eurographics Conference on Computational Aesthetics in Graphics, Visualization and Imaging, Computational Aesthetics’05, page 177–184, Goslar, DEU, 2005. Eurographics Association.
- Peter D. Grünwald and Paul M. B. Vitányi. Kolmogorov complexity and information theory. with an interpretation in terms of questions and answers. Journal of Logic, Language and Information, 12(4):497–529, 2003.
- Khalid Sayood. Introduction to data compression. Morgan Kaufmann, 2017.
- Exploiting intra-slice and inter-slice redundancy for learning-based lossless volumetric image compression. IEEE Transactions on Image Processing, 31:1697–1707, 2022.
- Likelihood-free out-of-distribution detection with invertible generative models. In Zhi-Hua Zhou, editor, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pages 2119–2125. International Joint Conferences on Artificial Intelligence Organization, 8 2021. Main Track.
- Variational inference with normalizing flows. In Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 1530–1538, Lille, France, 07–09 Jul 2015. PMLR.
- Density estimation using Real NVP. In International Conference on Learning Representations, 2017.
- Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems 31, pages 10215–10224. Curran Associates, Inc., 2018.
- Invertible residual networks. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 573–582. PMLR, 09–15 Jun 2019.
- Residual flows for invertible generative modeling. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
- Integer discrete flows and lossless compression. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
- PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications. In International Conference on Learning Representations, 2017.
- Chris M Bishop. Training with noise is equivalent to Tikhonov regularization. Neural Computation, 7(1):108–116, 1995.
- Herbert Federer. Geometric measure theory. Springer, 2014.
- Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- Understanding and mitigating exploding inverses in invertible neural networks. In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, volume 130 of Proceedings of Machine Learning Research, pages 1792–1800. PMLR, 13–15 Apr 2021.
- Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), December 2015.
- ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
- LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. CoRR, abs/1506.03365, 2015.
- Human-level concept learning through probabilistic program induction. Science, 350(6266):1332–1338, 2015.
- Yaroslav Bulatov. notMNIST dataset. Google (Books/OCR), Tech. Rep., 2011. [Online]. Available: http://yaroslavvb.blogspot.it/2011/09/notmnist-dataset.html
- Auto-encoding variational bayes. In International Conference on Learning Representations, 2014.
- Mu Cai and Yixuan Li. Out-of-distribution detection via frequency-regularized generative models. In Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023.
- Elements of information theory. John Wiley & Sons, 2012.
- Carlos Fernandez-Granda. Optimization-based data analysis: Lecture notes 3: Randomness, Fall 2017.
- Compression with flows via local bits-back coding. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
- Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2015.
- Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
Authors: Genki Osada, Tsubasa Takahashi, Takashi Nishide