VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection (2404.06217v1)
Abstract: Out-of-distribution (OOD) detection plays a crucial role in ensuring the safety and reliability of deep neural networks in various applications. While there has been a growing focus on OOD detection in visual data, the field of textual OOD detection has received less attention. Only a few attempts have been made to directly apply general OOD detection methods to NLP tasks, without adequately considering the characteristics of textual data. In this paper, we delve into textual OOD detection with Transformers. We first identify a key problem prevalent in existing OOD detection methods: the biased representation learned through the maximization of the conditional likelihood $p(y\mid x)$ can potentially result in subpar performance. We then propose a novel variational inference framework for OOD detection (VI-OOD), which maximizes the likelihood of the joint distribution $p(x, y)$ instead of $p(y\mid x)$. VI-OOD is tailored for textual OOD detection by efficiently exploiting the representations of pre-trained Transformers. Through comprehensive experiments on various text classification tasks, VI-OOD demonstrates its effectiveness and wide applicability. Our code has been released at \url{https://github.com/liam0949/LLM-OOD}.
- A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part VI, pages 548–564. Springer.
- Enhancing out-of-distribution detection in natural language understanding via implicit layer ensemble. arXiv preprint arXiv:2210.11034.
- Waic, but why? generative ensembles for robust anomaly detection. arXiv preprint arXiv:1810.01392.
- BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.
- Can autonomous vehicles identify, recover from, and adapt to distribution shifts? In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 3145–3153. PMLR.
- The tilted variational autoencoder: Improving out-of-distribution detection. In The Eleventh International Conference on Learning Representations.
- Scaling out-of-distribution detection for real-world settings. In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, volume 162 of Proceedings of Machine Learning Research, pages 8759–8773. PMLR.
- Dan Hendrycks and Kevin Gimpel. 2017. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.
- Pretrained transformers improve out-of-distribution robustness. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2744–2751.
- Deep anomaly detection with outlier exposure. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net.
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation, 9(8):1735–1780.
- Ryo Kamoi and Kei Kobayashi. 2020. Why is the mahalanobis distance effective for anomaly detection? CoRR, abs/2003.00402.
- Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings.
- Training confidence-calibrated classifiers for detecting out-of-distribution samples. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net.
- A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pages 7167–7177.
- How good are large language models at out-of-distribution detection? arXiv preprint arXiv:2308.10261.
- Energy-based out-of-distribution detection. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.
- Roberta: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.
- Universal text representation from BERT: an empirical study. CoRR, abs/1910.07973.
- Exploring the role of BERT token representations to explain sentence probing results. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pages 792–806. Association for Computational Linguistics.
- Density of states estimation for out of distribution detection. In The 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021, April 13-15, 2021, Virtual Event, volume 130 of Proceedings of Machine Learning Research, pages 3232–3240. PMLR.
- Do deep generative models know what they don’t know? In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net.
- Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015, pages 427–436. IEEE Computer Society.
- Revisiting mahalanobis distance for transformer-based out-of-domain detection. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, pages 13675–13682. AAAI Press.
- Revisiting mahalanobis distance for transformer-based out-of-domain detection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 13675–13682.
- Likelihood ratios for out-of-distribution detection. Advances in neural information processing systems, 32.
- How to fine-tune BERT for text classification? In Chinese Computational Linguistics - 18th China National Conference, CCL 2019, Kunming, China, October 18-20, 2019, Proceedings, volume 11856 of Lecture Notes in Computer Science, pages 194–206. Springer.
- React: Out-of-distribution detection with rectified activations. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, pages 144–157.
- Out-of-distribution detection with deep nearest neighbors. In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, volume 162 of Proceedings of Machine Learning Research, pages 20827–20840. PMLR.
- Intriguing properties of neural networks. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings.
- Trust issues: Uncertainty estimation does not enable reliable ood detection on medical tabular data. In Machine Learning for Health, pages 341–354. PMLR.
- Is fine-tuning needed? pre-trained language models are near perfect for out-of-domain detection. In Annual Meeting of the Association for Computational Linguistics.
- Attention is all you need. Advances in neural information processing systems, 30.
- Likelihood regret: An out-of-distribution detection score for variational auto-encoder. Advances in neural information processing systems, 33:20685–20696.
- OpenOOD: Benchmarking generalized out-of-distribution detection. In Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
- Out-of-scope intent detection with self-supervision and discriminative training. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3521–3532, Online. Association for Computational Linguistics.
- Deep open intent classification with adaptive decision boundary. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14374–14382.
- Contrastive out-of-distribution detection for pretrained transformers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pages 1100–1111. Association for Computational Linguistics.
- Contrastive out-of-distribution detection for pretrained transformers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1100–1111.
- Li-Ming Zhan (10 papers)
- Bo Liu (484 papers)
- Xiao-Ming Wu (91 papers)