Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies (2410.22886v1)

Published 30 Oct 2024 in cs.CL and cs.AI

Abstract: Curriculum Learning has been a popular strategy to improve the cognitive plausibility of Small-Scale Language Models (SSLMs) in the BabyLM Challenge, but it has not led to considerable improvements over non-curriculum models. We assess whether linguistic acquisition theories can be used to specify more fine-grained curriculum learning strategies, creating age-ordered corpora of Child-Directed Speech for four typologically distant language families to implement SSLMs and acquisition-inspired curricula cross-lingually. Comparing three objective curricula (Growing, Inwards and MMM) that precisely replicate the predictions of acquisition theories on a standard SSLM architecture, we find that fine-grained acquisition-inspired curricula can outperform non-curriculum baselines, and that the performance benefits of curriculum strategies in SSLMs can be derived by specifying fine-grained, language-specific curricula that precisely replicate language acquisition theories.
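
The abstract describes ordering Child-Directed Speech by age to form a training curriculum. The following is a minimal sketch of that general idea, not the authors' implementation of the Growing, Inwards, or MMM curricula: the `Utterance` record, its `child_age_months` field, and the staging logic are all assumptions made for illustration.

```python
# Hypothetical sketch of an age-ordered curriculum: child-directed speech is
# bucketed by the target child's age and yielded stage by stage, from the
# earliest ages to the latest. All names and fields here are invented for
# illustration; they do not come from the paper's codebase.
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Utterance:
    text: str
    child_age_months: int  # age of the target child when the utterance was recorded


def age_ordered_curriculum(
    utterances: List[Utterance],
    stage_months: int = 6,
) -> Iterator[List[str]]:
    """Yield successive training stages, each spanning `stage_months` of age."""
    ordered = sorted(utterances, key=lambda u: u.child_age_months)
    if not ordered:
        return
    stage: List[str] = []
    stage_end = ordered[0].child_age_months + stage_months
    for utt in ordered:
        # Close the current stage once we pass its age boundary.
        if utt.child_age_months >= stage_end and stage:
            yield stage
            stage = []
            stage_end = utt.child_age_months + stage_months
        stage.append(utt.text)
    if stage:
        yield stage


if __name__ == "__main__":
    corpus = [
        Utterance("look at the ball", 14),
        Utterance("where did the ball go?", 20),
        Utterance("shall we read a story before bed?", 32),
    ]
    for i, stage in enumerate(age_ordered_curriculum(corpus), start=1):
        print(f"stage {i}: {stage}")
```

In a pretraining loop, each yielded stage would be tokenized and passed to the model before the next, so earlier-acquired input is seen first; the paper's actual curricula order data according to specific acquisition theories rather than raw age buckets.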

Authors (4)
  1. Suchir Salhan (2 papers)
  2. Richard Diehl Martinez (13 papers)
  3. Paula Buttery (15 papers)
  4. Zébulon Goriely (5 papers)
