On the Compositional Generalization Gap of In-Context Learning (2211.08473v1)

Published 15 Nov 2022 in cs.CL and cs.LG

Abstract: Pretrained large generative language models have shown great performance on many tasks, but exhibit low compositional generalization abilities. Scaling such models has been shown to improve their performance on various NLP tasks even just by conditioning them on a few examples to solve the task without any fine-tuning (also known as in-context learning). In this work, we look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning. In the ID setting, the demonstrations come from the same split (test or train) that the model is being evaluated on, and in the OOD setting, they come from the other split. We look at how the relative generalization gap of in-context learning evolves as models are scaled up. We evaluate four model families, OPT, BLOOM, CodeGen and Codex, on three semantic parsing datasets, CFQ, SCAN and GeoQuery, with different numbers of exemplars, and observe a trend of decreasing relative generalization gap as models are scaled up.

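As a rough illustration of the evaluation setup described in the abstract, the sketch below shows how ID and OOD few-shot prompts might be assembled and how a relative generalization gap could be computed from the resulting accuracies. This is a hedged sketch, not the authors' code: the prompt template, the toy SCAN-like data, and the exact normalization used for the relative gap are assumptions for illustration only.

```python
import random

def build_prompt(exemplars, query):
    """Concatenate (input, output) demonstrations followed by the query input,
    in a generic few-shot format (the paper's actual template may differ)."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in exemplars)
    return f"{shots}\nInput: {query}\nOutput:"

def relative_generalization_gap(acc_id, acc_ood):
    """One plausible definition: the drop from ID to OOD accuracy,
    normalized by ID accuracy."""
    return (acc_id - acc_ood) / acc_id

# Toy SCAN-like splits; in the paper these would come from CFQ, SCAN, or GeoQuery.
train = [("walk left", "LTURN WALK"), ("jump twice", "JUMP JUMP")]
test = [("jump left twice", "LTURN JUMP LTURN JUMP"),
        ("walk left twice", "LTURN WALK LTURN WALK")]

rng = random.Random(0)
query, gold = test[0]
id_pool = [ex for ex in test if ex[0] != query]  # same split as the query (query excluded)
ood_pool = train                                 # the other split

id_prompt = build_prompt(rng.sample(id_pool, 1), query)
ood_prompt = build_prompt(rng.sample(ood_pool, 2), query)

# After scoring model completions under both settings:
print(relative_generalization_gap(acc_id=0.80, acc_ood=0.60))  # 0.25
```

In this reading, a model evaluated on the compositional test split sees test-split demonstrations in the ID setting and train-split demonstrations in the OOD setting, and the gap shrinking with model scale is the paper's main observation.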
Authors (5)
  1. Arian Hosseini (13 papers)
  2. Ankit Vani (8 papers)
  3. Dzmitry Bahdanau (46 papers)
  4. Alessandro Sordoni (53 papers)
  5. Aaron Courville (201 papers)
Citations (23)
