Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation? (2407.02062v2)
Abstract: This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER). For NER to advance in safety-critical fields such as healthcare and finance, Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), must produce accurate predictions with well-calibrated confidence in real-world applications. However, DNNs are prone to miscalibration, which limits their applicability, and existing methods for calibration and uncertainty estimation are computationally expensive. Our investigation found that data augmentation improves calibration and uncertainty estimation in NER under cross-genre and cross-lingual settings, and especially in the in-domain setting. Furthermore, we show that calibration tends to be more effective when the perplexity of the sentences generated by data augmentation is lower, and that increasing the amount of augmented data further improves calibration and uncertainty estimation.
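Confidence calibration of the kind studied here is commonly quantified with the Expected Calibration Error (ECE): predictions are bucketed by confidence, and the gap between each bucket's accuracy and its mean confidence is averaged. The abstract does not specify the paper's exact metric, so the following is only a minimal sketch of the standard binned ECE; the input arrays are hypothetical.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - mean confidence|
    over equal-width confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        acc = correct[mask].mean()        # fraction of correct predictions in bin
        conf = confidences[mask].mean()   # average confidence in bin
        ece += mask.mean() * abs(acc - conf)
    return ece

# Hypothetical per-entity confidences and correctness flags:
conf = [0.9, 0.8, 0.95, 0.6, 0.7]
hits = [1, 1, 1, 0, 1]
print(expected_calibration_error(conf, hits))
```

A perfectly calibrated model (e.g. 100% confidence on always-correct predictions) yields an ECE of 0; the larger the gap between confidence and actual accuracy, the higher the score.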