In-Context Learning for Text Classification with Many Labels (2309.10954v2)

Published 19 Sep 2023 in cs.CL and cs.LG

Abstract: In-context learning (ICL) using LLMs for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state of the art performance in few-shot settings for three common intent classification datasets, with no finetuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze the performance across number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of larger context lengths for ICL. By running several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.

Introduction

The paper examines how in-context learning (ICL) with LLMs can handle text classification tasks with many labels. Because an LLM's limited context window restricts how many examples fit in a prompt, the authors pair the LLM with a pre-trained dense retrieval model that selects, for each inference call, only a relevant subset of examples, giving the model a partial view of the full label space. This makes label spaces that were previously out of reach tractable for ICL, without any fine-tuning.

Methodology

The paper introduces a retrieval-augmented ICL setup in which a dense retrieval model, a Sentence-BERT encoder pre-trained on large text-pair datasets, dynamically selects the in-context examples most similar to the input by cosine similarity. Prompts are filled "greedily" up to capacity, so the LLM's full context window is used. Inference stays cheap: the LLM generates output freely, and the generated text is then matched to the closest class name using the same retrieval model.
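
The snippet below is a minimal sketch of this pipeline, not the authors' implementation. It assumes the sentence-transformers library with the "all-mpnet-base-v2" checkpoint standing in for the paper's Sentence-BERT retriever, a user-supplied generate(prompt) callable standing in for the LLM, and a simple character budget standing in for the token-level context window.

```python
# Sketch of retrieval-augmented ICL: retrieve similar demonstrations, greedily
# fill the prompt, let the LLM generate freely, then map the output to a label.
from sentence_transformers import SentenceTransformer, util

retriever = SentenceTransformer("all-mpnet-base-v2")  # assumed SBERT checkpoint

def build_prompt(query, pool, budget_chars=6000):
    """Greedily fill the prompt with the demonstrations most similar to the query."""
    q_emb = retriever.encode(query, convert_to_tensor=True)
    ex_embs = retriever.encode([ex["text"] for ex in pool], convert_to_tensor=True)
    ranked = util.cos_sim(q_emb, ex_embs)[0].argsort(descending=True).tolist()

    demos, used = [], 0
    for i in ranked:
        demo = f"Input: {pool[i]['text']}\nLabel: {pool[i]['label']}\n\n"
        if used + len(demo) > budget_chars:  # stop once the context budget is hit
            break
        demos.append(demo)
        used += len(demo)
    # Place the most similar demonstrations nearest the query (a common ICL convention).
    return "".join(reversed(demos)) + f"Input: {query}\nLabel:"

def classify(query, pool, label_names, generate):
    """Let the LLM generate freely, then map its output to the nearest class name."""
    prediction = generate(build_prompt(query, pool)).strip()
    lab_embs = retriever.encode(label_names, convert_to_tensor=True)
    pred_emb = retriever.encode(prediction, convert_to_tensor=True)
    return label_names[int(util.cos_sim(pred_emb, lab_embs)[0].argmax())]
```

Because the same encoder both selects demonstrations and resolves the generated text to a class, no component is fine-tuned and no per-label scoring pass is needed at inference time.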

Experimental Insights

Retrieval-augmented ICL sets new state-of-the-art (SoTA) few-shot results on three intent classification benchmarks and surpasses fine-tuned models in certain fine-grained sentiment classification cases. Ablations isolate three factors: the similarity of the in-context examples to the current input, the semantic content of the class names, and the correct correspondence between examples and labels; all three matter, to degrees that vary by dataset (a sketch of how such ablations can be constructed follows). The experiments also show that model scale is crucial: larger models are needed to consistently benefit from more in-context examples.
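
The following is a hypothetical illustration of how the three ablations could be applied to the demonstration pool before prompts are built; function and field names are illustrative and not taken from the paper's code.

```python
import random

def ablate_similarity(pool, k):
    """(a) Drop retrieval: pick k demonstrations uniformly at random instead."""
    return random.sample(pool, k)

def ablate_class_names(pool, label_names):
    """(b) Strip label semantics: map each class name to an opaque symbol."""
    mapping = {name: f"LABEL_{i}" for i, name in enumerate(label_names)}
    return [{"text": ex["text"], "label": mapping[ex["label"]]} for ex in pool]

def ablate_correspondence(pool):
    """(c) Break input-label pairing: shuffle labels across the demonstrations."""
    labels = [ex["label"] for ex in pool]
    random.shuffle(labels)
    return [{"text": ex["text"], "label": lab} for ex, lab in zip(pool, labels)]
```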

Conclusion and Future Directions

The findings show that retrieval-augmented ICL handles text classification with many labels without any fine-tuning of the retriever or the LLM, relying instead on their pre-training. Larger model architectures are better able to exploit longer contexts for in-context learning. The paper positions retrieval-augmented ICL as a practical paradigm for applying LLMs to complex classification tasks across diverse domains and task scopes.

Authors (3)
  1. Aristides Milios
  2. Siva Reddy
  3. Dzmitry Bahdanau