Sample complexity required for inductive OOCR

Determine the number of training samples required for large language models to acquire latent information via inductive out-of-context reasoning (OOCR) across representative tasks.

Background

When comparing finetuning-based OOCR to in-context learning, the authors note that context-window limits constrain the number of in-context examples, and that they did not tune their finetuning setup for sample efficiency. They therefore highlight the need to quantify how many training documents are necessary for OOCR to succeed.

This question is central to understanding the practical viability and costs of OOCR, since different tasks (e.g., Coins vs. Functions) may require varying amounts of evidence aggregation to infer latent variables with sufficient confidence.
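A minimal sketch of how such a measurement might be set up is below, loosely modeled on the paper's Functions task: finetune on n documents that each show a single input/output pair, then test whether the model can verbalize the latent function out of context. Everything here is an assumption for illustration, not the paper's code: the function names (make_document, finetune_and_evaluate, sample_complexity_sweep), the choice of latent function, the size grid, and the stubbed-in learning curve, which a real study would replace with an actual finetuning and grading pipeline.

```python
import math
import random

# Hypothetical sample-complexity sweep for inductive OOCR on a
# Functions-style task. Not the paper's implementation.

LATENT_FN = lambda x: 3 * x + 2  # hidden function the model must infer

def make_document(rng: random.Random) -> str:
    """One training document: a single input/output observation that
    never states the function explicitly."""
    x = rng.randint(-100, 100)
    return f"f({x}) = {LATENT_FN(x)}"

def finetune_and_evaluate(documents: list[str]) -> float:
    """Stand-in for a real pipeline that (1) finetunes an LLM on the
    documents and (2) asks it out-of-context questions such as
    'What does f compute?', returning the fraction answered correctly.
    Here we fake a saturating learning curve purely so the sketch runs."""
    n = len(documents)
    return 1.0 - math.exp(-n / 200)  # replace with real finetune + eval

def sample_complexity_sweep(sizes=(10, 30, 100, 300, 1000), seed=0):
    """Estimate OOCR accuracy as a function of training-set size."""
    rng = random.Random(seed)
    return {n: finetune_and_evaluate([make_document(rng) for _ in range(n)])
            for n in sizes}

if __name__ == "__main__":
    for n, acc in sample_complexity_sweep().items():
        print(f"{n:5d} documents -> simulated OOCR accuracy {acc:.2f}")
```

The smallest n at which accuracy reliably clears a chosen threshold would give a per-task sample-complexity estimate, and repeating the sweep across tasks (e.g., Coins vs. Functions) would show how the required amount of evidence aggregation varies.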

References

"We leave exploration of required number of samples for inductive OOCR for future work."

Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data (Treutlein et al., 20 Jun 2024, arXiv:2406.14546), Section 3.1 (Preliminary: in-context learning).