In-Context Learning with Many Demonstration Examples (2302.04931v1)

Published 9 Feb 2023 in cs.CL and cs.AI

Abstract: Large pre-trained language models (PLMs) have shown promising in-context learning abilities. However, due to the backbone transformer architecture, existing PLMs are bottlenecked by memory and computational cost when scaling up to a large context size, leaving instruction tuning and in-context learning with many demonstration examples, as well as long-range language modeling, under-explored. In this study, we propose EVALM, a long-range language model based on an efficient transformer mechanism. EVALM is trained with 8k tokens per batch line and can be evaluated on contexts of up to 256k tokens via extrapolation, 128 times the limit of existing PLMs (e.g., GPT-3). Based on EVALM, we efficiently scale up the number of examples in both instruction tuning and in-context learning to explore the boundary of the benefits from more annotated data. Experimental results on a diverse set of tasks show that EVALM achieves 4.1% higher accuracy on average, and that the average context length at which the best accuracy is reached is around 12k tokens. We find that in-context learning can achieve higher performance with more demonstrations under many-shot instruction tuning (8k), and that extending the length of instructions further (16k) can raise the upper bound of scaling in-context learning.
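
The scaling recipe the abstract describes, packing as many annotated demonstrations as a long context budget allows before the test query, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the prompt template, the whitespace-based token counter, and the 8k budget are placeholder assumptions standing in for EVALM's actual tokenizer and training setup.

    # Minimal sketch (not from the paper): build a many-shot in-context learning prompt
    # by concatenating labeled demonstrations until a long context budget is filled.
    # The prompt format and the whitespace token count are illustrative assumptions;
    # a real setup would use the model's own tokenizer.

    def count_tokens(text: str) -> int:
        """Crude stand-in for a real tokenizer: whitespace token count."""
        return len(text.split())

    def build_many_shot_prompt(demonstrations, query, budget_tokens=8192):
        """Append (input, label) demonstrations until the token budget is reached,
        then add the test query, mirroring the idea of scaling in-context learning
        to long contexts (e.g. 8k to 16k tokens)."""
        parts, used = [], 0
        for x, y in demonstrations:
            shot = f"Input: {x}\nLabel: {y}\n\n"
            cost = count_tokens(shot)
            if used + cost > budget_tokens:
                break  # stop once the context budget is exhausted
            parts.append(shot)
            used += cost
        parts.append(f"Input: {query}\nLabel:")
        return "".join(parts)

    if __name__ == "__main__":
        demos = [("the movie was great", "positive"),
                 ("the plot made no sense", "negative")] * 1000  # many demonstrations
        prompt = build_many_shot_prompt(demos, "a delightful surprise", budget_tokens=8192)
        print(count_tokens(prompt), "tokens in the many-shot prompt")

Under this kind of packing, the paper's finding that accuracy tends to peak around 12k tokens of context corresponds to simply raising budget_tokens until additional demonstrations stop helping.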

Authors (7)
  1. Mukai Li (17 papers)
  2. Shansan Gong (14 papers)
  3. Jiangtao Feng (24 papers)
  4. Yiheng Xu (20 papers)
  5. Jun Zhang (1008 papers)
  6. Zhiyong Wu (171 papers)
  7. Lingpeng Kong (134 papers)
Citations (26)