Complementary Explanations for Effective In-Context Learning (2211.13892v2)

Published 25 Nov 2022 in cs.CL

Abstract: Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to better understand the mechanisms by which explanations are used for in-context learning. We first study the impact of two different factors on the performance of prompts with explanations: the computation trace (the way the solution is decomposed) and the natural language used to express the prompt. By perturbing explanations on three controlled tasks, we show that both factors contribute to the effectiveness of explanations. We further study how to form maximally effective sets of explanations for solving a given test query. We find that LLMs can benefit from the complementarity of the explanation set: diverse reasoning skills shown by different exemplars can lead to better performance. Therefore, we propose a maximal marginal relevance-based exemplar selection approach for constructing exemplar sets that are both relevant and complementary, which successfully improves the in-context learning performance across three real-world tasks on multiple LLMs.
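The exemplar selection approach described in the abstract follows the general maximal marginal relevance (MMR) idea: greedily pick exemplars that are similar to the test query while penalizing similarity to exemplars already chosen, so the resulting set is both relevant and complementary. Below is a minimal sketch of that selection loop, assuming the test query and candidate exemplars have already been embedded by some encoder; the function name, the cosine-similarity choice, and the trade-off value `lam` are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def mmr_select(query_vec, candidate_vecs, k=4, lam=0.5):
    """Greedy MMR selection of k exemplars (illustrative sketch).

    query_vec: embedding of the test query, shape (d,)
    candidate_vecs: embeddings of the exemplar pool, shape (n, d)
    lam: trade-off between relevance to the query (lam -> 1) and
         diversity w.r.t. already-selected exemplars (lam -> 0)
    Returns the indices of the selected exemplars, in selection order.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    relevance = np.array([cos(query_vec, c) for c in candidate_vecs])
    selected, remaining = [], list(range(len(candidate_vecs)))

    while remaining and len(selected) < k:
        best_idx, best_score = None, -np.inf
        for i in remaining:
            # Redundancy: similarity to the closest already-selected exemplar.
            redundancy = max(
                (cos(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lam * relevance[i] - (1.0 - lam) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected
```

The first pick is simply the most relevant exemplar (no redundancy term applies); each subsequent pick trades query relevance against overlap with the exemplars already in the prompt, which is how the set ends up covering diverse reasoning skills rather than repeating one.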

Authors (6)
  1. Xi Ye (33 papers)
  2. Srinivasan Iyer (20 papers)
  3. Asli Celikyilmaz (80 papers)
  4. Ves Stoyanov (15 papers)
  5. Greg Durrett (117 papers)
  6. Ramakanth Pasunuru (32 papers)
Citations (76)