Effective In-Context Example Selection through Data Compression (2405.11465v1)
Abstract: In-context learning has been extensively validated in LLMs. However, the mechanism and strategy behind in-context example selection, a crucial ingredient of this approach, lack systematic and in-depth research. In this paper, we propose a data compression approach to the selection of in-context examples. We introduce a two-stage method that effectively chooses relevant examples while retaining sufficient information about the training dataset within the in-context examples. Our method yields an average improvement of 5.90% across five real-world datasets using four LLMs.
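The abstract does not spell out the two stages, so the following Python sketch is purely illustrative of what a compression-based, two-stage selector could look like: stage one filters the candidate pool by relevance to the query, and stage two greedily keeps examples that contribute the most information not already covered by the selected set. The use of gzip-based normalized compression distance and the greedy coverage criterion are assumptions made for this sketch, not the authors' actual method.

```python
"""Illustrative sketch (not the paper's exact algorithm) of a two-stage,
compression-based in-context example selector. Assumption: gzip compressed
length serves as a crude proxy for information content."""
import gzip


def _clen(text: str) -> int:
    # Compressed length in bytes, a rough proxy for information content.
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    # Normalized compression distance: small when a and b share information.
    ca, cb, cab = _clen(a), _clen(b), _clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def select_examples(query: str, pool: list[str],
                    k_candidates: int = 32, k_final: int = 8) -> list[str]:
    # Stage 1: keep the candidates closest to the query under NCD (relevance).
    candidates = sorted(pool, key=lambda ex: ncd(query, ex))[:k_candidates]

    # Stage 2: greedily add the candidate whose inclusion adds the most
    # information not already present in the selected set (coverage).
    selected: list[str] = []
    while candidates and len(selected) < k_final:
        context = " ".join(selected)
        best = max(candidates,
                   key=lambda ex: _clen(context + " " + ex) - _clen(context))
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy pool of labeled demonstrations (hypothetical data).
    pool = [
        "The movie was a delight from start to finish. -> positive",
        "A tedious, joyless slog. -> negative",
        "Great acting but a weak plot. -> mixed",
        "I would watch it again tomorrow. -> positive",
    ]
    print(select_examples("An uneven film with flashes of brilliance.",
                          pool, k_candidates=4, k_final=2))
```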