- The paper characterizes knowledge collapse across three distinct stages, revealing that LLMs maintain fluency even as factual accuracy deteriorates.
- It employs a cyclical synthetic training framework with varying synthetic ratios to measure entropy decline, vocabulary narrowing, and accuracy loss.
- Domain-specific synthetic training is introduced as a mitigation strategy that significantly delays collapse and preserves semantic fidelity.
Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training
Introduction
This paper provides a rigorous empirical and methodological analysis of "knowledge collapse" in LLMs subjected to recursive synthetic training. The authors distinguish knowledge collapse from general model collapse by focusing on the phenomenon where factual accuracy degrades while surface fluency and instruction-following persist, resulting in "confidently wrong" outputs. The paper is motivated by the increasing reliance on synthetic data for LLM training due to the scarcity of high-quality human-written content and the contamination of internet sources with AI-generated text. The work systematically characterizes the collapse process, its conditional dependence on prompt format, and proposes domain-specific synthetic training as a mitigation strategy.
Experimental Framework
The authors employ a cyclical workflow for recursive synthetic training, iteratively generating datasets, evaluating models, and fine-tuning across multiple generations.
Figure 1: Cyclical workflow for recursive synthetic training: dataset generation, dual evaluation, and fine-tuning repeated across generations.
Training is performed on the GEMMA 3 1B IT model using WikiText-2 as the base corpus, with synthetic fractions α ∈ {0.25, 0.50, 1.00} controlling the proportion of synthetic data. Evaluation is conducted on five MMLU subjects reformatted to short-answer style to isolate factual retention. The framework combines model-centric metrics (perplexity, entropy, gibberish score) and task-centric metrics (accuracy, greedy rate, option bias, judge score, entailment score) to detect and characterize collapse phases.
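A minimal sketch of one pass of this loop is shown below, assuming the Hugging Face `transformers` and `torch` libraries and the `google/gemma-3-1b-it` checkpoint; the mixing helper, hyperparameters, and half-epoch update are illustrative placeholders, not the authors' exact configuration.

```python
# Sketch of one recursive-training generation: mix real and synthetic data at
# ratio alpha, lightly fine-tune, then regenerate synthetic text for the next
# cycle. Hyperparameters and corpus handling are illustrative only.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/gemma-3-1b-it"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(MODEL_ID)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def mix_corpus(real, synthetic, alpha, size):
    """Sample a training set whose synthetic fraction is alpha."""
    n_syn = int(alpha * size)
    return random.choices(synthetic, k=n_syn) + random.choices(real, k=size - n_syn)

def light_finetune(texts, lr=2e-5, batch_size=2, epoch_fraction=0.5):
    """Light-touch causal-LM update (~0.5 epochs) so drift stays gradual."""
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for step in range(int(epoch_fraction * len(texts) / batch_size)):
        batch = texts[step * batch_size:(step + 1) * batch_size]
        enc = tok(batch, return_tensors="pt", padding=True,
                  truncation=True, max_length=512)
        labels = enc["input_ids"].clone()
        labels[enc["attention_mask"] == 0] = -100  # ignore padding in the loss
        model(**enc, labels=labels).loss.backward()
        optim.step()
        optim.zero_grad()
    model.eval()

@torch.no_grad()
def generate_synthetic(prompts, max_new_tokens=128):
    """Regenerate the synthetic pool with top-k / top-p sampling for diversity."""
    out = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids
        gen = model.generate(ids, max_new_tokens=max_new_tokens,
                             do_sample=True, top_k=50, top_p=0.95)
        out.append(tok.decode(gen[0], skip_special_tokens=True))
    return out

# One generation of the cycle (evaluation between steps omitted):
# train_set = mix_corpus(real_texts, synthetic_texts, alpha=0.5, size=512)
# light_finetune(train_set)
# synthetic_texts = generate_synthetic(seed_prompts)
```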
Three-Stage Knowledge Collapse
The central contribution is the definition and empirical validation of a three-stage knowledge collapse process: Stage A, a stable phase in which outputs remain both fluent and factually accurate; Stage B, a "confidently wrong" phase in which fluency and instruction-following persist while factual accuracy erodes; and Stage C, full collapse, in which fluency itself degrades into repetitive or gibberish output.
The transition from Stage A to Stage B is particularly hazardous for downstream applications, as models remain fluent and confident while factual reliability erodes. The onset and trajectory of collapse are strongly modulated by the synthetic data ratio: lower ratios delay collapse, while higher ratios accelerate it. Distributional analysis reveals rapid entropy decline and vocabulary narrowing at high synthetic ratios, with lower ratios preserving discriminative capability despite accuracy loss.
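As a rough operationalization, the stage boundaries can be expressed as a decision rule over the tracked metrics; the thresholds below are hypothetical placeholders for illustration, not values calibrated in the paper.

```python
def collapse_stage(accuracy, gibberish_score, acc_baseline):
    """Classify a generation into the three collapse stages.
    Thresholds are illustrative, not the paper's calibrated values."""
    fluent = gibberish_score < 0.3             # low gibberish => still fluent
    accurate = accuracy > 0.8 * acc_baseline   # near the pre-collapse baseline
    if fluent and accurate:
        return "Stage A: stable (fluent and factual)"
    if fluent:
        return "Stage B: confidently wrong (fluent, facts eroding)"
    return "Stage C: full collapse (fluency itself degrades)"
```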
A key finding is that collapse dynamics are conditional on instruction format rather than being prompt-agnostic. Short-answer prompts maintain stability longer, few-shot prompts collapse rapidly, and zero-shot prompts exhibit intermediate degradation.
Figure 3: Instruction format sensitivity in knowledge collapse showing conditional degradation patterns.
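The three evaluation formats can be viewed as wrappers around the same underlying question; a minimal sketch, with hypothetical template wording rather than the paper's exact prompts:

```python
def make_prompt(question, fmt, exemplars=()):
    """Build short-answer, zero-shot, or few-shot variants of one item.
    Template wording is hypothetical, not the paper's exact prompts."""
    if fmt == "short_answer":
        return f"Answer in one or two words.\nQ: {question}\nA:"
    if fmt == "zero_shot":
        return f"Q: {question}\nA:"
    if fmt == "few_shot":
        shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
        return f"{shots}\nQ: {question}\nA:"
    raise ValueError(f"unknown format: {fmt}")
```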
Statistical analysis (two-way ANOVA) confirms significant main effects and interactions between instruction format and generation, with few-shot prompts exhibiting the earliest collapse. This demonstrates that prompt structure mediates the vulnerability of models to synthetic data artifacts, with complex formats amplifying systematic biases and accelerating collapse.
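A two-way ANOVA of this kind can be reproduced with standard tooling; a sketch assuming a long-format results table with per-run accuracy, using `statsmodels` (the file and column names are assumptions):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# df: one row per evaluation run, with columns
#   accuracy (float), fmt (short_answer / zero_shot / few_shot), generation (int)
df = pd.read_csv("eval_runs.csv")  # hypothetical results file

# Main effects of instruction format and generation, plus their interaction.
fit = ols("accuracy ~ C(fmt) * C(generation)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))
```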
Distributional and Semantic Degradation
Distributional indicators across generations show that high synthetic ratios induce rapid vocabulary narrowing, increased gibberish scores, and divergence between external validation and internal consistency.
Figure 4: Distributional indicators across generations under different synthetic ratios.
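Entropy decline and vocabulary narrowing can be tracked with simple corpus statistics per generation; a minimal sketch over whitespace tokens (the paper's exact tokenization may differ):

```python
import math
from collections import Counter

def distributional_stats(texts):
    """Shannon entropy (bits) and distinct-token count of one generation's outputs."""
    counts = Counter(tok for t in texts for tok in t.split())
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {"entropy_bits": entropy, "vocab_size": len(counts)}

# Collapse shows up as both numbers shrinking across generations:
# stats = [distributional_stats(outputs[g]) for g in range(num_generations)]
```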
Semantic fidelity metrics (judge and entailment scores) further validate the three-stage framework. Lower synthetic ratios show gradual semantic drift, while 100% synthetic regimes exhibit rapid collapse and entailment illusions, where verbose gibberish creates spurious lexical overlap with gold references.
Figure 5: Judge and entailment scores across generations. The 100% synthetic regime shows rapid semantic degradation, while lower ratios exhibit gradual drift, supporting the Stage B "confidently wrong" phenomenon.
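An entailment score of this kind can be computed with an off-the-shelf NLI cross-encoder; a sketch assuming the `cross-encoder/nli-deberta-v3-base` model from `sentence-transformers`, scoring one direction only (the paper's exact model and scoring protocol may differ):

```python
import numpy as np
from sentence_transformers import CrossEncoder

# NLI cross-encoder; per its model card the output logits are ordered
# (contradiction, entailment, neutral).
nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")

def entailment_score(gold, prediction):
    """Probability that the gold reference entails the model's answer."""
    logits = nli.predict([(gold, prediction)])[0]
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[1])  # index 1 = entailment
```

Note that this metric is exactly where the "entailment illusions" arise: verbose gibberish that happens to share vocabulary with the gold reference can inflate the score, which is why the paper pairs it with LLM-as-judge scoring.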
Domain-Specific Synthetic Training as Mitigation
The authors propose domain-specific synthetic training as a targeted mitigation strategy. By restricting synthetic data generation to a subject-aligned corpus (World Religions), they achieve a 15× improvement in collapse resistance compared to general synthetic training. Domain-specific models maintain stable accuracy, controlled perplexity growth, and entropy preservation, while general models exhibit rapid degradation.
Figure 6: Domain-specific mitigation through WikiText filtering to focus on World Religions content demonstrates superior collapse resistance.
Distributional analysis confirms that domain-specific training preserves lexical diversity and semantic coherence, delaying both knowledge and instruction-following collapse.
Figure 7: Domain-specific training preserves distributional stability through entropy maintenance and controlled perplexity growth.
Behavioral analysis shows that domain-aligned models maintain superior accuracy, confidence dynamics, and semantic quality across generations.
Figure 8: Domain-specific training maintains superior behavioral stability and semantic quality.
Implementation Considerations
The experimental setup is fully reproducible using publicly available datasets (WikiText-2, MMLU) and the GEMMA 3 1B IT model. Training uses light-touch updates (0.5 epochs per generation) so that drift can be observed gradually. Synthetic data generation employs top-k and top-p (nucleus) sampling to encourage diversity. Evaluation leverages both automated metrics and LLM-as-judge scoring for semantic fidelity. Domain-specific corpus construction utilizes semantic filtering and reranking with Sentence-BERT and cross-encoder models, as sketched below.
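A sketch of that retrieve-then-rerank filtering step with `sentence-transformers` and `datasets` follows; the specific model names and the topic query are assumptions, not the paper's reported choices:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, CrossEncoder, util

# Candidate passages from the base corpus (very short lines dropped).
wikitext = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
passages = [t for t in wikitext["text"] if len(t.split()) > 20]

# Assumed topic query describing the target subject.
query = "world religions: beliefs, practices, scriptures, and history"

# Stage 1: fast Sentence-BERT bi-encoder retrieval.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
p_emb = embedder.encode(passages, convert_to_tensor=True)
q_emb = embedder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(q_emb, p_emb, top_k=500)[0]

# Stage 2: slower cross-encoder reranking of the retrieved candidates.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, passages[h["corpus_id"]]) for h in hits]
scores = reranker.predict(pairs)
ranked = sorted(zip(scores.tolist(), pairs), key=lambda x: x[0], reverse=True)
domain_corpus = [p for _, (_, p) in ranked[:200]]  # best-aligned passages
```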
Resource requirements are modest: a single GPU (RTX 3090 Ti, 24 GB), with each generation requiring approximately one hour of compute. The framework scales to larger models and domains, though the mitigation benefits of domain-specific training are domain-bound and do not generalize to out-of-domain evaluation.
Implications and Future Directions
The identification of knowledge collapse as a distinct, prompt-conditional, three-stage process has significant implications for the sustainable training and deployment of LLMs in accuracy-critical domains. The persistence of fluency and confidence during factual degradation (Stage B) poses safety risks, particularly in applications such as healthcare, law, and education. The demonstrated effectiveness of domain-specific synthetic training suggests that distributional anchoring can delay collapse, but does not eliminate it.
Theoretical implications include the breakdown of neural scaling laws under recursive synthetic training and the need for collapse-aware training protocols. Future research should extend the analysis to larger model scales, more diverse domains, and reasoning-heavy tasks. Predictive frameworks for early detection of collapse onset and adaptive mitigation strategies are critical for robust LLM lifecycle management.
Conclusion
This paper provides a comprehensive characterization of knowledge collapse in LLMs under recursive synthetic training, distinguishing it from general model collapse by its conditional, prompt-dependent nature and the persistence of fluency during factual degradation. The three-stage framework, combined with rigorous empirical analysis and domain-specific mitigation, offers actionable guidance for sustainable AI training in knowledge-intensive applications. The findings underscore the necessity of prompt-aware evaluation and domain-aligned synthetic data strategies to preserve factual reliability in future LLM development.