Emergent Mind

Emergent Abilities of Large Language Models

Published Jun 15, 2022 in cs.CL


Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.


  • LLMs develop emergent abilities—unexpected capabilities—as they increase in size.

  • These emergent abilities display a phase transition, appearing past a certain scale and markedly improving model performance.

  • In prompted tasks with no further training, LLMs improve significantly only after crossing a high threshold of parameter count and training compute.

  • Researchers are exploring augmented prompting and fine-tuning techniques that become effective as LLMs scale up.

  • The study of LLMs' emergent abilities is crucial for understanding model predictability and may lead to the discovery of more sophisticated capabilities.

Emergent Abilities of LLMs


LLMs have progressed remarkably in recent years. One striking phenomenon, seen particularly in larger models, is the development of unexpected capabilities known as emergent abilities. These abilities do not appear in smaller models but do appear in larger ones, so the performance trend cannot be obtained by simply extrapolating from smaller models. Emergence in this context means a qualitative change in behavior that arises from a quantitative increase in a system, here the size of the language model as measured by parameter count and training compute.

Emergent Abilities Defined

Emergence in LLMs is evident when model performance leaps well beyond the predictable gains seen across smaller models. A distinctive attribute of emergent abilities is their phase-transition-like nature. At small scales, performance on a task may be near random, as if the model lacks the ability entirely. Then, past a certain scale threshold, performance rises sharply. This behavior is akin to a phase transition in physics, where a substantial change of state reveals properties that could not have been foreseen from the earlier state. Notably, most dense Transformer language model families scale training compute roughly in proportion to parameter count, so the trend looks similar whether scale is measured in parameters or in training compute.
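The near-random-then-sharp-rise pattern described above can be made concrete with a small sketch. The helper name `emergence_threshold`, the `margin` parameter, and the numbers in `toy_curve` are all assumptions made up for illustration; they are not taken from the paper and only show one plausible way to locate a threshold on a scaling curve.

```python
from typing import Optional, Sequence, Tuple


def emergence_threshold(
    points: Sequence[Tuple[float, float]],
    random_baseline: float,
    margin: float = 0.05,
) -> Optional[float]:
    """Return the smallest scale from which accuracy stays above
    random_baseline + margin at every larger scale, or None if it never does.

    `points` holds (scale, accuracy) pairs; scale could be training FLOPs
    or parameter count. This is a hypothetical helper, not the paper's method.
    """
    threshold = None
    for scale, accuracy in sorted(points):
        if accuracy > random_baseline + margin:
            # Remember only the first scale of the current above-baseline run.
            if threshold is None:
                threshold = scale
        else:
            # Performance fell back to near-random, so reset the candidate.
            threshold = None
    return threshold


# Made-up curve for a four-way multiple-choice task (random baseline 0.25).
toy_curve = [
    (1e20, 0.25),
    (1e21, 0.26),
    (1e22, 0.24),
    (1e23, 0.27),
    (1e24, 0.55),
    (1e25, 0.70),
]

print(emergence_threshold(toy_curve, random_baseline=0.25))
```

On this toy curve the first four models sit within the margin of random guessing, so the function reports the fifth scale as the threshold. The margin is a judgment call: too small and noise around the baseline registers as emergence, too large and a genuine jump is missed.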

Observations in Prompt-Based Tasks

The unpredictability of emergent abilities is particularly striking in the prompting paradigm, where an LLM responds to a prompt with no further training or weight updates. A prime example is few-shot prompting: models such as PaLM and GPT-3 improve on these tasks only after reaching very large parameter counts and amounts of training compute. The improvements were recorded across a battery of such tasks, from arithmetic to transliteration, indicating a broad spectrum of emergent abilities.
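As a minimal sketch of what few-shot prompting looks like in practice, the snippet below assembles a prompt from a handful of solved exemplars followed by an unsolved query. The `Q:`/`A:` layout, the function name `build_few_shot_prompt`, and the arithmetic exemplars are assumptions chosen for illustration, not the format used in the paper's experiments.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(exemplars: Sequence[Tuple[str, str]], query: str) -> str:
    """Join solved (question, answer) exemplars and one open query into a prompt.

    The model is expected to continue the text after the final "A:".
    No weights are updated; the exemplars are the only task signal.
    """
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in exemplars]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# Hypothetical arithmetic exemplars.
prompt = build_few_shot_prompt(
    exemplars=[("What is 12 + 7?", "19"), ("What is 30 + 45?", "75")],
    query="What is 123 + 456?",
)
print(prompt)
```

The same prompt string would be sent unchanged to models of different sizes; the emergent pattern described above concerns how accuracy on the completions changes with model scale, not any change to the prompt.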

Augmented Prompting Techniques

Besides raw scaling, researchers have investigated augmented prompting and fine-tuning strategies. Such a technique counts as an emergent ability if it gives no improvement, or is actively harmful, compared with the baseline until it is applied to a model of sufficient scale. Examples include chain-of-thought prompting, which facilitates multi-step reasoning, and scratchpad methods that assist with sequential operations. Techniques for model calibration have also been observed to be effective only at larger scales.
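The criterion above, that a technique helps only beyond some scale, can be sketched as a comparison of two scaling curves. The function name `crossover_scale` and every accuracy value below are hypothetical, invented purely to illustrate the comparison; they are not results from the paper.

```python
from typing import Dict, Optional


def crossover_scale(
    baseline: Dict[float, float],
    augmented: Dict[float, float],
) -> Optional[float]:
    """Return the smallest shared scale from which the augmented technique
    beats the baseline at every larger shared scale, or None if it never does.

    Both arguments map scale (e.g. training FLOPs) to task accuracy.
    """
    crossover = None
    for scale in sorted(set(baseline) & set(augmented)):
        if augmented[scale] > baseline[scale]:
            # Keep the first scale of the current winning run.
            if crossover is None:
                crossover = scale
        else:
            # The technique did not help (or hurt) here, so reset.
            crossover = None
    return crossover


# Made-up accuracies: the augmented prompt is slightly worse at small
# scales and clearly better only at the largest one.
standard_prompting = {1e21: 0.04, 1e22: 0.06, 1e23: 0.10, 1e24: 0.18}
chain_of_thought = {1e21: 0.02, 1e22: 0.05, 1e23: 0.09, 1e24: 0.45}

print(crossover_scale(standard_prompting, chain_of_thought))
```

A technique whose crossover sits at a large scale, or that never crosses over in the range tested, is easy to dismiss if it is evaluated only on small models, which is why the scale of the evaluation matters when judging these methods.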


The research frontier for LLMs includes identifying the limits of their emergent abilities, especially since these capabilities challenge our current understanding of model predictability. The premise is that with additional scaling, even more sophisticated abilities may emerge. However, emergence might also be achievable without simply increasing model scale, for example through improved architectures, training methods, or data quality, or by targeting tasks that expose current model weaknesses. These findings heighten the need for the computational linguistics community to investigate the causes and dynamics of LLMs' emergent behaviors.
