Compositional Exemplars for In-context Learning (2302.05698v3)

Published 11 Feb 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability, where the model learns to do an unseen task via a prompt consisting of input-output examples as the demonstration, without any parameter updates. The performance of ICL is highly dominated by the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which is instantiated by Determinantal Point Processes (DPPs) to model the interaction between the given input and in-context examples, and optimized through a carefully-designed contrastive learning objective to obtain preference from LMs. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only the state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.

Compositional Exemplars for In-context Learning

The paper "Compositional Exemplars for In-context Learning" addresses the intricacies of selecting in-context examples for large pre-trained LLMs (LMs) during in-context learning (ICL). The paper introduces Compositional Exemplars for In-context Learning (CEIL), a novel approach utilizing Determinantal Point Processes (DPPs) to enhance the selection process of demonstration examples used when prompting LLMs for unseen tasks, requiring no parameter updates.

Key Contributions

  1. Reformulating In-context Example Selection: The paper recasts in-context example selection as a subset selection problem, arguing that the interactions among examples are crucial for performance. Using DPPs, the authors define a joint probability over example sets, thereby capturing interactions that traditional methods, which score each example independently, ignore (see the first sketch after this list).
  2. Learning from a Contrastive Objective: The DPP is refined with contrastive learning so that it prefers more contextually appropriate example subsets. The model is trained on candidate subsets annotated with scores reflecting how much each subset improves output accuracy, as judged by the LLM itself, so the selector learns the LM's preferences directly (see the second sketch after this list).
  3. Performance on Diverse NLP Tasks: CEIL was validated across 12 datasets spanning 7 tasks, achieving state-of-the-art performance on tasks such as sentiment analysis and semantic parsing. Notable gains were observed on harder tasks such as natural language inference, where capturing the nuanced interrelationships between examples is especially important.
  4. Transferability and Compositionality: Beyond raw accuracy, CEIL transfers learned selection preferences across different LLMs and datasets, which is practically advantageous because it reduces the need for task-specific retraining. The approach also showed promise on compositional tasks that require dynamically adapting the exemplars to produce suitable decomposed representations.
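
The subset-selection view can be made concrete with a small sketch. The snippet below is an illustration only: it assumes a generic quality-times-diversity kernel built from embeddings (the paper learns its kernel with a trained retriever), scores a subset by the log-determinant of the corresponding kernel submatrix, and selects exemplars with simple greedy MAP inference.

```python
import numpy as np

def dpp_log_score(L, subset):
    """Unnormalized log-probability of a subset under a DPP with kernel L:
    log det(L_S), where L_S is the principal submatrix indexed by `subset`."""
    sign, logdet = np.linalg.slogdet(L[np.ix_(subset, subset)])
    return logdet if sign > 0 else -np.inf

def greedy_map_inference(L, k):
    """Greedily add the item giving the largest log det(L_S).
    A simple O(n*k) loop; faster exact-greedy variants exist in the literature."""
    selected = []
    for _ in range(k):
        candidates = [i for i in range(L.shape[0]) if i not in selected]
        selected.append(max(candidates, key=lambda i: dpp_log_score(L, selected + [i])))
    return selected

# Toy kernel: relevance to the test input times pairwise similarity of exemplars.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 16))                      # hypothetical exemplar embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
query = rng.normal(size=16)
query /= np.linalg.norm(query)                       # hypothetical test-input embedding

relevance = np.exp(emb @ query)                      # per-exemplar relevance to the input
similarity = emb @ emb.T                             # pairwise exemplar similarity (Gram matrix)
L = np.outer(relevance, relevance) * similarity      # quality x diversity kernel (PSD)

print(greedy_map_inference(L, k=4))                  # indices of 4 relevant yet diverse exemplars
```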

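The training signal can be sketched in a similar spirit. The PyTorch snippet below is a generic InfoNCE-style contrastive loss over candidate subsets, not the paper's exact objective; it simply pushes the DPP score of the LM-preferred subset above the other candidates, with the kernel produced by some learnable retriever (assumed here).

```python
import torch
import torch.nn.functional as F

def contrastive_dpp_loss(L, subsets, lm_scores, temperature=0.1):
    """Generic contrastive sketch: raise log det(L_S) for the subset the LM
    scored highest, relative to the other candidate subsets.
    `L` is a learnable PSD kernel, `subsets` a list of index lists,
    `lm_scores` the LM-derived utility of each subset (assumed given)."""
    logdets = torch.stack([torch.logdet(L[s][:, s]) for s in subsets])
    positive = torch.argmax(lm_scores)               # best-scoring subset is the positive
    logits = (logdets / temperature).unsqueeze(0)    # shape [1, num_subsets]
    return F.cross_entropy(logits, positive.unsqueeze(0))

# Example: a random PSD kernel standing in for a learnable retriever's output,
# and three candidate subsets scored by the LM.
A = torch.randn(10, 6, requires_grad=True)
L = A @ A.T + 1e-3 * torch.eye(10)
loss = contrastive_dpp_loss(L, [[0, 1], [2, 3], [4, 5]], torch.tensor([0.9, 0.2, 0.4]))
loss.backward()
```
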
Implications and Speculations for the Future

The findings imply that modeling the interrelationships among examples is a significant factor in optimizing in-context learning for LLMs. As LMs continue to grow in scale and capability, methodologies like CEIL will be pivotal for maintaining efficiency and effectiveness in real-world applications, where model parameters and architectures often must remain fixed due to technical or infrastructural constraints.

Future work could further investigate the domain adaptability of CEIL and improve its efficiency for real-time applications. Exploring alternative contrastive frameworks, or scaling CEIL to the larger context sizes that new-generation LMs support, also holds potential for broader applications and a deeper understanding of ICL dynamics.

Conclusion

The research detailed in "Compositional Exemplars for In-context Learning" marks an innovative step forward in optimizing example selection for in-context learning. CEIL provides a robust framework that balances diversity and relevance within the exemplar set, addressing limitations of previous heuristic-based approaches. Through its principled formulation and comprehensive validation, CEIL sets a new benchmark in ongoing efforts to improve LLMs via effective contextual demonstrations.

Authors (5)
  1. Jiacheng Ye
  2. Zhiyong Wu
  3. Jiangtao Feng
  4. Tao Yu
  5. Lingpeng Kong
Citations (88)