
Inflation Narratives Dataset

Updated 15 December 2025
  • Inflation Narratives Dataset is a curated resource capturing explicit and implicit causal links between inflation and macroeconomic events in U.S. news media.
  • It aggregates annotated excerpts and sentences from sources like WSJ, NYT, NOW, and ProQuest to benchmark both traditional and LLM-based narrative detection pipelines.
  • The dataset supports empirical analyses of media framing and economic discourse, despite limitations in cross-domain transfer and annotator subjectivity.

The Inflation Narratives Dataset is a curated resource for empirical research on the extraction, classification, and quantitative analysis of causal economic narratives surrounding inflation in U.S. news media. It provides annotated textual data and controlled gold standards for benchmarking both traditional and LLM-based pipelines for narrative detection, with particular emphasis on explicit and implicit causal claims about inflation, its determinants, and its effects (Schmidt et al., 18 Jun 2025, Heddaya et al., 7 Oct 2024).

1. Definition and Scope

The task of identifying “inflation narratives” entails recognizing passages in news media that express a causal relation between inflation and other macroeconomic phenomena. Two principal resources are referenced in current literature:

  • (Schmidt et al., 18 Jun 2025): Focuses on explicitly formulated economic narratives, defined as causally linked pairs of semantically and temporally distinct events, within selected newspaper articles mentioning inflation or significant price changes from 1985–2023 (WSJ/NYT).
  • (Heddaya et al., 7 Oct 2024): Emphasizes sentence-level causal micro-narratives, where sentences are categorized according to ontology-based “cause” and “effect” labels drawn from a taxonomy of inflation-related concepts, spanning U.S. news from 1960–1980 and 2012–2023 (ProQuest; NOW).

Both datasets specifically exclude mere references to inflation without explicit or implicit causal attribution, and restrict annotation to U.S. English-language coverage. The inflation narrative annotation is thus domain-, language-, and period-specific.

2. Corpus Construction and Sampling

  • (Schmidt et al., 18 Jun 2025):
    • Source: Wall Street Journal and New York Times.
    • Sampling: Articles filtered for “inflation” or “price (increase|hike|surge)” in content.
    • Annotated subset: 100 randomly selected articles.
    • Excerpting: Each excerpt consists of one inflation-mentioning sentence plus the two preceding and two following sentences, averaging ≈5 sentences/120 words per excerpt.
    • Total annotation set: ≈12,000 words; 291 narrative pairs.
  • (Heddaya et al., 7 Oct 2024):
    • Sources: NOW Corpus, 2012–2023 (contemporary), 118,383 filtered articles; ProQuest U.S. news, 1960–80 (historical), 392,475 filtered articles.
    • Segmentation: Sentence tokenization via spaCy (NOW) and BlingFire (ProQuest).
    • Annotation set: ≈1,000 sentences per source for training; test sets of 1,119 sentences (NOW) and 488 sentences (ProQuest).

This sampling framework yields annotation units that are either multi-sentence excerpts (Schmidt et al., 18 Jun 2025) or single sentences (Heddaya et al., 7 Oct 2024), suitable respectively for structured narrative-link extraction and multi-label classification.
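The sampling filter and segmentation step can be sketched as follows. This is a minimal illustration assuming spaCy’s small English model; the regex is a stand-in for the papers’ exact filter, not a reproduction of it.

import re
import spacy

# Sentence segmentation as used for the NOW corpus (spaCy); the filter
# pattern mirrors the sampling rule above but is illustrative only.
nlp = spacy.load("en_core_web_sm")
PATTERN = re.compile(r"\binflation\b|\bprice (increase|hike|surge)s?\b", re.I)

def candidate_sentences(article_text: str) -> list[str]:
    """Return sentences mentioning inflation or notable price changes."""
    doc = nlp(article_text)
    return [sent.text.strip() for sent in doc.sents if PATTERN.search(sent.text)]

text = ("Inflation accelerated last quarter. Analysts blamed supply "
        "disruptions. A price surge in energy followed.")
print(candidate_sentences(text))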

3. Annotation Guidelines and Ontology

  • (Schmidt et al., 18 Jun 2025):
    • Event structure: “Event A – causes – Event B” or “Event A – is caused by – Event B”.
    • Events: Must describe discernible occurrences, activities, conditions, policies, or future plans.
    • Wording: Verbatim recording, avoiding synonymization or tense modification.
    • Chaining/forking: Chained or forked causality is decomposed into atomic pairs.
    • Coreference: Pronouns are resolved to antecedent entities.
    • Economic relevance: Only narratives with explicit economic inflation relevance are coded.
    • Positive causality: Negations or “does not cause” relations are excluded.
  • (Heddaya et al., 7 Oct 2024):
    • Causes (8): demand, supply, wage, monetary, fiscal, expect, international, other-cause.
    • Effects (11): purchase, cost, uncertain, rates, redistribution, savings, trade, cost-push, social, govt, other-effect.
    • Annotation task: Binary narrative detection (presence/absence), plus multi-label assignment from this ontology.
    • Additional tags: Temporal frame (past/present/future/na), direction (up/down/na), and U.S. context.

A key distinction is that (Schmidt et al., 18 Jun 2025) encodes explicit event–event causal pairs as structured triples, while (Heddaya et al., 7 Oct 2024) maps sentences to an ontology for multi-label statistical categorization.
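For concreteness, the sentence-level schema of (Heddaya et al., 7 Oct 2024) can be modeled as typed records. In the sketch below, only the label vocabularies are taken from the ontology above; the class and field names are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Literal, Optional

CAUSES = {"demand", "supply", "wage", "monetary", "fiscal",
          "expect", "international", "other-cause"}
EFFECTS = {"purchase", "cost", "uncertain", "rates", "redistribution",
           "savings", "trade", "cost-push", "social", "govt", "other-effect"}

@dataclass
class Narrative:
    # A sentence casts inflation either as the effect (a cause label is
    # set) or as the cause (an effect label is set).
    cause: Optional[str] = None
    effect: Optional[str] = None
    time: Literal["past", "present", "future", "na"] = "na"
    direction: Literal["up", "down", "na"] = "na"

    def __post_init__(self):
        if self.cause is not None and self.cause not in CAUSES:
            raise ValueError(f"unknown cause label: {self.cause}")
        if self.effect is not None and self.effect not in EFFECTS:
            raise ValueError(f"unknown effect label: {self.effect}")

@dataclass
class AnnotatedSentence:
    raw_text: str
    source: str                      # e.g. "NOW" or "ProQuest"
    contains_narrative: bool = False
    narratives: list[Narrative] = field(default_factory=list)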

4. Dataset Structure and Formats

  • (Schmidt et al., 18 Jun 2025):
    • Unit: Article excerpt.
    • Annotation fields: {Event A, causal connector, Event B} (JSON triple).
    • Size: 100 excerpts, 291 narratives.
  • (Heddaya et al., 7 Oct 2024):
    • Unit: Sentence.
    • Annotation fields: contains-narrative (bool); “narratives”: [{cause/effect, time, dir}].
    • Size: ≈2,600 annotated sentences.

(Schmidt et al., 18 Jun 2025) Example Entry:

{
  "Event A": "the FOMC raised the policy rate",
  "causal connector": "causes",
  "Event B": "turmoil on Wall Street"
}

(Heddaya et al., 7 Oct 2024) Illustrative Example Entry (see App. D of that paper for the exact format):

{
  "contains-narrative": true,
  "narratives": [
    {"cause": "monetary", "time": "present", "direction": "up"}
  ],
  "source": "NOW",
  "raw_text": "Higher central bank rates have pushed up inflation expectations."
}

Files are distributed in JSON, with CSV metadata (excerpt text, document ID, counts) available in (Schmidt et al., 18 Jun 2025). Conversion to tabular format is straightforward, as sketched below.
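As a sketch of that conversion, the sentence-level JSON records can be flattened to one row per (sentence, narrative) pair with pandas; the file name and exact field keys here are assumptions based on the example entry above, not a documented release layout.

import json
import pandas as pd

# Load annotated sentences (path is hypothetical).
with open("inflation_narratives_now.json") as f:
    records = json.load(f)

# One row per (sentence, narrative) pair; sentences without any
# narrative keep a single row with empty label fields.
rows = []
for rec in records:
    base = {"raw_text": rec["raw_text"], "source": rec["source"],
            "contains_narrative": rec["contains-narrative"]}
    for narr in (rec.get("narratives") or [{}]):
        rows.append({**base,
                     "cause": narr.get("cause"),
                     "effect": narr.get("effect"),
                     "time": narr.get("time"),
                     "direction": narr.get("direction")})

df = pd.DataFrame(rows)
df.to_csv("inflation_narratives_now.csv", index=False)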

5. Annotation Workflow and Agreement

  • (Schmidt et al., 18 Jun 2025):
    • Annotators: Three experts independently coded the full corpus.
    • Codebook refinement: Iterative calibration meetings.
    • Adjudication: Any narrative unanimously judged correct was included.
    • Agreement metrics: Pairwise Jaccard similarity (J≈0.59–0.60); expert-vs-gold accuracy 0.67–0.74.
  • (Heddaya et al., 7 Oct 2024):
    • Interface: All 19 labels visible with definitions; only explicit or implicit causal claims coded.
    • Agreement: Krippendorff’s α (MASI-weighted): binary detection 0.80 (ProQuest) and 0.67 (NOW); multi-label 0.66 (ProQuest) and 0.59 (NOW).
    • Observed confusion: Disagreement most often concerns the presence or absence of any narrative, especially for the “social” label, rather than label choice.

Both studies report substantial but imperfect inter-annotator agreement; annotator subjectivity and linguistic ambiguity remain persistent constraints.
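For reference, the pairwise Jaccard agreement reported by (Schmidt et al., 18 Jun 2025) can be computed as below, assuming each annotator’s output is reduced to a set of comparable (Event A, Event B) pairs per excerpt; the normalization of event phrasings is not reproduced here.

from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two annotation sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pairwise_agreement(annotations: dict[str, set]) -> float:
    """Mean Jaccard similarity over all annotator pairs."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Toy example with three annotators on one excerpt.
ann = {
    "A": {("rate hike", "market turmoil"), ("supply shock", "inflation")},
    "B": {("rate hike", "market turmoil")},
    "C": {("rate hike", "market turmoil"), ("supply shock", "inflation")},
}
print(round(pairwise_agreement(ann), 2))  # 0.67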

6. Extraction Methods and Model Performance

LLM and Fine-tuned Model Pipelines

  • (Schmidt et al., 18 Jun 2025):
    • Model: GPT-4o (snapshot gpt-4o-2024-11-20, T=0.2).
    • Prompting: Few-shot Chain-of-Thought with 7 curated exemplars.
    • Extraction sequence: Discard extra-corpus material, isolate the first two-event chain, restate the causal link in JSON, resolve coreference, and finalize event phrasing.
    • Post-processing: Fork decomposition, LLM-based valence/topic clustering, normalization.
    • Metrics: Accuracy 0.44 (vs. 0.67–0.74 for experts); Jaccard similarity 0.40 (vs. experts’ 0.59–0.60); the model over-extracts in narrative-sparse texts and under-extracts in narrative-dense ones.
  • (Heddaya et al., 7 Oct 2024):
    • Best-performing model: LLaMA 3.1 8B, fine-tuned.
    • F₁: 0.87 (NOW, binary), 0.71 (NOW, multi-label); 0.78/0.62 (ProQuest).
    • Comparators: Phi-2 (fine-tuned) and GPT-4o (few-shot), lower F₁.
    • Cross-domain F₁ drop: 3–4% absolute loss for multitask transfer; up to 11% gain on detection.
    • Major sources of error: Linguistic ambiguity, model confusion paralleling annotator disagreement, false chains.

A plausible implication is that controlled fine-tuning with a rich ontology and clear interface yields higher detection/classification fidelity than zero/few-shot LLM prompting.
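The extraction setup of (Schmidt et al., 18 Jun 2025) can be approximated with the OpenAI chat API using the reported snapshot and temperature; the system prompt and single exemplar below are illustrative stand-ins for the authors’ seven curated Chain-of-Thought exemplars, not the actual prompt.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

SYSTEM = (
    "Extract the first causal economic narrative from the excerpt. "
    'Return JSON: {"Event A": ..., "causal connector": "causes" or '
    '"caused by", "Event B": ...}, quoting events verbatim and '
    "resolving pronouns to their antecedents. Return {} if none."
)

# One illustrative exemplar; the paper uses 7 curated examples.
EXEMPLAR_IN = "After the FOMC raised the policy rate, turmoil hit Wall Street."
EXEMPLAR_OUT = ('{"Event A": "the FOMC raised the policy rate", '
                '"causal connector": "causes", '
                '"Event B": "turmoil on Wall Street"}')

def extract_narrative(excerpt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-2024-11-20",  # snapshot reported in the paper
        temperature=0.2,
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": EXEMPLAR_IN},
            {"role": "assistant", "content": EXEMPLAR_OUT},
            {"role": "user", "content": excerpt},
        ],
    )
    return response.choices[0].message.content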

7. Applications, Access, and Limitations

Use Cases

  • Tracking diachronic or cross-outlet shifts in inflation narratives.
  • Construction of narrative-intensity indices for macroeconomic modeling (a simple construction is sketched after this list).
  • Empirical studies of media framing and economic discourse.
  • Benchmarking LLMs and model pipelines for interpretability in social science contexts.
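One simple narrative-intensity index, not specified in either paper, is the monthly share of annotated sentences attributing inflation to a given cause; the "date" column is an assumed addition to the flattened table from Section 4.

import pandas as pd

def narrative_intensity(df: pd.DataFrame, cause: str) -> pd.Series:
    """Monthly share of sentences attributing inflation to `cause`
    (e.g. "monetary"); expects "date" and "cause" columns."""
    month = pd.to_datetime(df["date"]).dt.to_period("M")
    return df["cause"].eq(cause).groupby(month).mean()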

Access

Limitations

  • Domain-, language-, and corpus-specificity restricts generalizability.
  • (Schmidt et al., 18 Jun 2025): Small annotated sample (N=100 excerpts, 291 narratives); reliance on a proprietary LLM (GPT-4o) in the pipeline limits replicability.
  • (Heddaya et al., 7 Oct 2024): Out-of-domain and cross-period transfer validity only partially explored.
  • Both: Annotator subjectivity, high inter-coder variability, imperfect model-human alignment.

8. Conclusion

The Inflation Narratives Dataset, as defined by (Schmidt et al., 18 Jun 2025) and (Heddaya et al., 7 Oct 2024), constitutes a rigorously annotated, publicly available resource for studying causal claims about inflation in U.S. media corpora. It provides gold-standard annotation protocols, detailed schema for both structured and multi-label classification, and benchmarks for both expert and LLM-driven extraction processes. The resource underpins current methodological advances in computational narrative analysis and supports a spectrum of applications in empirical economics, computational social science, and AI-driven media analytics.
