Improving Opinion-based Question Answering Systems Through Label Error Detection and Overwrite (2306.07499v1)

Published 13 Jun 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Label error is a ubiquitous problem in annotated data. Large amounts of label error substantially degrade the quality of deep learning models. Existing methods for tackling the label error problem largely focus on the classification task, and either rely on task-specific architectures or require non-trivial additional computation, which is undesirable or even unattainable for industry usage. In this paper, we propose LEDO: a model-agnostic and computationally efficient framework for Label Error Detection and Overwrite. LEDO is based on Monte Carlo Dropout combined with uncertainty metrics, and can be easily generalized to multiple tasks and datasets. Applying LEDO to an industry opinion-based question answering system demonstrates that it is effective at improving accuracy in all the core models. Specifically, LEDO brings a 1.1% MRR gain for the retrieval model, a 1.5% PR AUC improvement for the machine reading comprehension model, and a 0.9% rise in Average Precision for the ranker, on top of strong baselines trained on a large-scale social media dataset. Importantly, LEDO is computationally efficient compared to methods that require loss-function changes, and cost-effective because the resulting data can be used in the same continuous training pipeline for production. Further analysis shows that these gains come from an improved decision boundary after cleaning the label errors present in the training data.
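
The abstract only describes the method at a high level. As a rough illustration of the general recipe it names (Monte Carlo Dropout passes plus a confidence-based criterion to flag and overwrite suspect labels), the following minimal PyTorch sketch may help; the function names, the number of passes, and the confidence threshold are illustrative assumptions, not details taken from the paper.

```python
import torch

def mc_dropout_probs(model, inputs, n_passes=20):
    # Keep dropout active at inference time ("MC Dropout") and average the
    # predicted class probabilities over several stochastic forward passes.
    model.train()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_passes)]
        )
    return probs.mean(dim=0)  # shape: (num_examples, num_classes)

def detect_and_overwrite(mean_probs, labels, confidence_threshold=0.9):
    # Flag examples where the averaged prediction confidently disagrees with
    # the annotated label, and propose the model's class as the new label.
    confidence, predicted = mean_probs.max(dim=-1)
    suspect = (predicted != labels) & (confidence >= confidence_threshold)
    cleaned_labels = torch.where(suspect, predicted, labels)
    return suspect, cleaned_labels
```

In a sketch like this, the cleaned labels would simply be fed back into the existing continuous training pipeline, consistent with the cost-effectiveness argument made in the abstract.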
