
Crafting In-context Examples according to LMs' Parametric Knowledge (2311.09579v2)

Published 16 Nov 2023 in cs.CL

Abstract: In-context learning can improve performance on knowledge-rich tasks such as question answering. In such scenarios, in-context examples trigger a language model (LM) to surface information stored in its parametric knowledge. We study how to better construct in-context example sets based on whether the model is aware of the in-context examples. We identify 'known' examples, which the model can answer correctly from its parametric knowledge, and 'unknown' ones. Our experiments show that prompting with 'unknown' examples decreases performance, potentially because it encourages hallucination rather than retrieval from parametric knowledge. Constructing an in-context example set that presents both known and unknown information performs best across diverse settings. We perform our analysis on three multi-answer question answering datasets, which further allows us to study answer set ordering strategies based on the LM's knowledge of each answer. Together, our study sheds light on how to best construct in-context example sets for knowledge-rich tasks.
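The abstract outlines the core recipe: probe whether the model already 'knows' each candidate example by answering it closed-book, then mix known and unknown examples in the prompt. Below is a minimal Python sketch of that idea, not the authors' code; `generate` is a hypothetical stand-in for any LM completion call, and the substring containment check is a simplified proxy for answer correctness.

```python
# Sketch: classify candidate in-context examples as 'known' or 'unknown'
# via a closed-book probe, then build a prompt mixing both kinds.
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, gold_answer)

def is_known(generate: Callable[[str], str], question: str, gold: str) -> bool:
    """An example counts as 'known' if the model answers it correctly
    closed-book, i.e. from parametric knowledge alone."""
    prediction = generate(f"Q: {question}\nA:")
    return gold.lower() in prediction.lower()  # simplified correctness check

def build_mixed_prompt(generate: Callable[[str], str],
                       candidates: List[Example],
                       test_question: str,
                       k: int = 4) -> str:
    """Assemble k demonstrations, mixing known and unknown examples,
    the configuration the paper reports works best."""
    known = [ex for ex in candidates if is_known(generate, *ex)]
    unknown = [ex for ex in candidates if ex not in known]
    chosen = (known[: k // 2] + unknown[: k - k // 2])[:k]
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in chosen)
    return f"{demos}\nQ: {test_question}\nA:"
```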

Authors (4)
  1. Yoonsang Lee
  2. Pranav Atreya
  3. Xi Ye
  4. Eunsol Choi