
How do language models learn facts? Dynamics, curricula and hallucinations (2503.21676v1)

Published 27 Mar 2025 in cs.CL and cs.LG

Abstract: LLMs accumulate vast knowledge during pre-training, yet the dynamics governing this acquisition remain poorly understood. This work investigates the learning dynamics of LLMs on a synthetic factual recall task, uncovering three key findings: First, LLMs learn in three phases, exhibiting a performance plateau before acquiring precise factual knowledge. Mechanistically, this plateau coincides with the formation of attention-based circuits that support recall. Second, the training data distribution significantly impacts learning dynamics, as imbalanced distributions lead to shorter plateaus. Finally, hallucinations emerge simultaneously with knowledge, and integrating new knowledge into the model through fine-tuning is challenging, as it quickly corrupts its existing parametric memories. Our results emphasize the importance of data distribution in knowledge acquisition and suggest novel data scheduling strategies to accelerate neural network training.

Authors (6)
  1. Nicolas Zucchet (11 papers)
  2. Jörg Bornschein (8 papers)
  3. Stephanie Chan (23 papers)
  4. Andrew Lampinen (11 papers)
  5. Razvan Pascanu (138 papers)
  6. Soham De (38 papers)

Summary

The acquisition of factual knowledge within LLMs during pre-training is a complex process whose underlying dynamics are not fully understood. Research investigating these dynamics using synthetic factual recall tasks provides insights into how models learn, forget, and sometimes confabulate information (Zucchet et al., 27 Mar 2025). This work reveals distinct learning phases, highlights the critical role of training data distribution, and examines the relationship between knowledge acquisition, hallucinations, and the challenges of post-hoc knowledge integration via fine-tuning.

Learning Dynamics and Phases

The process by which LLMs learn facts appears to follow a distinct multi-phase trajectory. Analysis based on synthetic recall tasks indicates that learning unfolds in three primary stages. Initially, the model exhibits basic learning, likely acquiring surface-level statistics or shallow correlations. This is followed by a notable performance plateau, where improvements in factual recall stagnate despite continued training. Crucially, this plateau phase is not indicative of failed learning but rather corresponds to a period of internal reorganization within the model. Mechanistically, this plateau coincides with the formation and refinement of specific attention-based circuits. These circuits are hypothesized to be essential for the precise recall of stored facts, acting as retrieval mechanisms over the parametrically encoded knowledge. Only after these circuits are sufficiently developed does the model enter the third phase, characterized by a rapid increase in performance on the factual recall task, indicating the successful acquisition and accessibility of the target knowledge.
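To make the setup concrete, the sketch below shows one way such a synthetic factual recall task and its evaluation could look. It is a minimal illustration, not the authors' code: the fact schema, the attribute values, and the `model` interface (a callable mapping a prompt string to a predicted string) are all assumptions introduced here for clarity.

```python
# Minimal sketch of a synthetic factual-recall setup: (subject, relation, object)
# triples plus an exact-match recall metric logged at training checkpoints.
# All names (make_facts, evaluate_recall, the `model` callable) are illustrative.
import random

def make_facts(num_subjects: int, relations=("birth_city", "profession", "employer")):
    """Assign each synthetic subject one attribute value per relation."""
    values = {
        "birth_city": ["Paris", "Lagos", "Osaka", "Lima", "Oslo"],
        "profession": ["chemist", "pilot", "actor", "judge", "farmer"],
        "employer": ["AcmeCo", "Globex", "Initech", "Umbrella", "Hooli"],
    }
    facts = []
    for s in range(num_subjects):
        for r in relations:
            facts.append((f"subject_{s}", r, random.choice(values[r])))
    return facts

def evaluate_recall(model, facts) -> float:
    """Fraction of facts whose object the model predicts exactly, given a
    (subject, relation) prompt. `model` is any prompt -> string callable."""
    hits = sum(
        model(f"{subject} {relation}:") == obj
        for subject, relation, obj in facts
    )
    return hits / len(facts)

# Logging evaluate_recall at each checkpoint traces the three phases:
# an early rise from chance, a flat plateau while recall circuits form,
# then a sharp jump once they are in place.
```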

Impact of Data Distribution and Curricula

The distribution of facts within the pre-training corpus significantly influences the observed learning dynamics. Specifically, the duration of the performance plateau is sensitive to the balance of factual instances in the training data. Experiments demonstrate that training on imbalanced distributions, where some facts appear more frequently than others, leads to a shorter plateau phase compared to training on uniformly distributed factual data. This suggests that the model can more quickly form the necessary recall mechanisms (the attention-based circuits) when the data provides stronger, albeit potentially biased, signals for certain facts. This finding has direct implications for training efficiency. It suggests that manipulating the data distribution or employing specific data scheduling strategies (curricula) during pre-training could potentially accelerate the knowledge acquisition process by shortening or even bypassing prolonged plateau phases. Designing curricula that strategically present factual information could optimize the formation of recall circuits, leading to faster convergence on knowledge-intensive tasks.
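The following sketch illustrates the kind of distributional manipulation involved; it is an assumption-laden example rather than the paper's exact protocol, and the Zipf exponent, step counts, and two-stage schedule are arbitrary placeholders.

```python
# Illustrative sketch: sampling training examples for the same fact set under
# a uniform versus an imbalanced (Zipf-like) frequency distribution.
import random

def sample_stream(facts, num_steps: int, zipf_exponent: float | None = None):
    """Yield `num_steps` training facts. zipf_exponent=None gives uniform
    sampling; otherwise fact i is drawn with probability proportional to
    1 / (i + 1) ** zipf_exponent."""
    if zipf_exponent is None:
        weights = [1.0] * len(facts)
    else:
        weights = [1.0 / (i + 1) ** zipf_exponent for i in range(len(facts))]
    return random.choices(facts, weights=weights, k=num_steps)

# One curriculum in this spirit: an imbalanced warm-up to form recall circuits
# quickly, followed by uniform sampling so rare facts are eventually learned.
# warmup = sample_stream(facts, 10_000, zipf_exponent=1.0)
# main   = sample_stream(facts, 90_000, zipf_exponent=None)
```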

Emergence of Hallucinations and Challenges in Knowledge Integration

An important observation is the simultaneous emergence of factual recall capabilities and the propensity for hallucinations. As the model begins to successfully recall learned facts (exiting the plateau phase), it also starts generating plausible but incorrect factual statements. This suggests that the mechanisms enabling recall are closely intertwined with those that can lead to confabulation, potentially arising from partially formed or incorrectly accessed memory representations. Furthermore, the paper highlights significant challenges in integrating new factual knowledge into an already trained model via fine-tuning. While fine-tuning can introduce new information, it appears highly susceptible to rapidly corrupting existing parametric memories. Attempting to update the model's knowledge base post-pre-training can interfere with previously learned facts, leading to catastrophic forgetting or degradation of overall factual accuracy. This underscores the difficulty of maintaining the integrity of an LLM's vast, implicitly stored knowledge base during incremental updates and suggests that naive fine-tuning approaches may be insufficient for reliable knowledge editing or augmentation.
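A probe of this corruption effect might look like the sketch below. The `finetune_one_step` and `evaluate_recall` interfaces are assumed placeholders, not functions from the paper; the point is simply to track recall on previously learned facts while new facts are being injected.

```python
# Hedged sketch: monitor old-fact recall while fine-tuning on new facts.
def probe_forgetting(model, old_facts, new_facts, num_steps: int,
                     finetune_one_step, evaluate_recall, eval_every: int = 10):
    """Return (step, old-fact recall, new-fact recall) tuples collected during
    fine-tuning. A rapid drop in the old-fact curve alongside a rising
    new-fact curve is the corruption of parametric memory described above."""
    history = []
    for step in range(num_steps):
        finetune_one_step(model, new_facts)  # one optimizer update on new facts
        if step % eval_every == 0:
            history.append((step,
                            evaluate_recall(model, old_facts),
                            evaluate_recall(model, new_facts)))
    return history
```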

In conclusion, the learning of facts in LLMs proceeds through distinct phases linked to the formation of attention circuits, is heavily influenced by the training data distribution, and occurs alongside the emergence of hallucinations. The fragility of parametric knowledge during fine-tuning poses significant hurdles for integrating new information, emphasizing the need for more sophisticated knowledge updating techniques and careful consideration of data curricula during pre-training.
