Human-scale data sufficiency for emergent compositional abilities

Determine whether the in-context compositional generalization capabilities observed in large language models pretrained on internet-scale corpora can also emerge in models trained on developmentally plausible, human-lifetime amounts of data.

Background

The authors argue that large pretrained LLMs exhibit emergent in-context learning and compositional behaviors despite being trained on unstructured text. However, they note that these models are trained on orders of magnitude more data than a human encounters in a lifetime. Establishing whether similar capabilities arise with human-scale data would directly bear on debates about whether massive data is necessary for compositional generalization and on the plausibility of LLMs as models of human development.

References

"However, current LLMs are trained on orders of magnitude more data than humans experience in an entire lifetime, making it unclear whether similar capabilities could emerge in models trained on a more realistic scale."

Russin et al., "From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks" (arXiv:2405.15164, 24 May 2024), Section 6.2, Implications for Human Development.