On the Noise Robustness of In-Context Learning for Text Generation

Published 27 May 2024 in cs.CL and cs.LG (arXiv:2405.17264v3)

Abstract: LLMs have shown impressive performance on downstream tasks via in-context learning (ICL), which relies heavily on the quality of demonstrations selected from a large set of annotated examples. Recent works claim that in-context learning is robust to noisy demonstrations in text classification. In this work, we show that, on text generation tasks, noisy annotations significantly hurt the performance of in-context learning. To circumvent this issue, we propose a simple and effective approach called Local Perplexity Ranking (LPR), which replaces "noisy" candidates with their nearest neighbors that are more likely to be clean. Our method is motivated by analyzing the perplexity deviation caused by noisy labels and decomposing perplexity into inherent perplexity and matching perplexity. The key idea behind LPR is thus to decouple the matching perplexity by ranking among the neighbors in semantic space. Our approach prevents the selected demonstrations from including mismatched input-label pairs while preserving the effectiveness of the original selection methods. Extensive experiments demonstrate the effectiveness of LPR, improving the EM score by up to 18.75 on common benchmarks with noisy annotations. Our code is available at https://github.com/ml-stat-Sustech/Local-Perplexity-Ranking.
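
The abstract describes the core loop of LPR: for each selected demonstration, look at its nearest neighbors in semantic space, score each candidate input-output pair by the model's perplexity, and keep the lowest-perplexity (most plausibly clean) member of that local neighborhood. The sketch below illustrates that idea under several assumptions: the embedding model (`all-MiniLM-L6-v2`), the small GPT-2 scoring model, and all function names are illustrative choices, not the authors' exact implementation (which is available at the repository linked above).

```python
# Hypothetical sketch of Local Perplexity Ranking (LPR) for ICL demonstration
# cleaning. Model choices and function names are assumptions, not the paper's
# exact setup.
import numpy as np
import torch
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForCausalLM, AutoTokenizer

embedder = SentenceTransformer("all-MiniLM-L6-v2")            # semantic space (assumed)
tokenizer = AutoTokenizer.from_pretrained("gpt2")             # small scoring LM (assumed)
scorer = AutoModelForCausalLM.from_pretrained("gpt2").eval()


def perplexity(question: str, answer: str) -> float:
    """Perplexity of the answer conditioned on the question under the scoring LM."""
    ids = tokenizer(question + " " + answer, return_tensors="pt").input_ids
    prompt_len = tokenizer(question, return_tensors="pt").input_ids.shape[1]
    labels = ids.clone()
    labels[:, :prompt_len] = -100                             # score only the answer tokens
    with torch.no_grad():
        loss = scorer(ids, labels=labels).loss
    return float(torch.exp(loss))


def local_perplexity_ranking(pool, selected_ids, k=5):
    """Replace each selected demonstration with the lowest-perplexity member of
    its k-nearest-neighbor set; `pool` is a list of (question, answer) pairs."""
    embs = embedder.encode([q for q, _ in pool])
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    cleaned = []
    for idx in selected_ids:
        sims = embs @ embs[idx]                               # cosine similarity to all pool items
        neighbors = np.argsort(-sims)[:k]                     # local neighborhood, includes idx itself
        best = min(neighbors, key=lambda j: perplexity(*pool[j]))
        cleaned.append(pool[best])
    return cleaned
```

Restricting the perplexity comparison to a semantically local neighborhood is what approximates the paper's decomposition here: inputs in the same neighborhood share roughly the same inherent perplexity, so ranking within it isolates the matching perplexity that a mismatched (noisy) label inflates.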
