
Global Populism Database (GPD)

Updated 13 October 2025
  • GPD is a comprehensive database that systematically measures populist rhetoric in global political speeches using a 0–2 holistic grading scale.
  • It employs a detailed annotation protocol with rubric-based chain-of-thought prompting, validated through metrics like Pearson and ICC.
  • The platform enables scalable, transparent, and comparative analysis of populist framing, facilitating robust political science research.

The Global Populism Database (GPD) is a comprehensive empirical resource designed to systematically measure and compare the ideational content of populist rhetoric in political speeches given by leaders and candidates worldwide. The GPD employs a rigorous annotation protocol—rooted in holistic grading methods—to produce a cross-national, multilingual, and temporally extended panel of populism scores, supporting both foundational research and state-of-the-art computational applications in political science (Tamaki et al., 8 Oct 2025).

1. Data Sources, Structure, and Annotation Protocols

The GPD aggregates political speeches from global leaders across dozens of countries and multiple decades, with coverage including campaign, famous, international, and ceremonial addresses. The annotation protocol is based on holistic grading (HG): each speech is read in full and scored on a 0–2 continuum, where 0 corresponds to negligible populist content and 2 reflects strong and pervasive populism. Coders follow a detailed rubric and anchor speech documentation that includes multiple exemplar speeches from canonical leaders (Tony Blair, George Bush, Barack Obama, Stephen Harper, Sarah Palin, Robert Mugabe, Evo Morales, etc.). The rubric explicitly describes at least six pro-populist and pluralist criteria to guide consistent annotation. For balanced coverage along the scale, anchor speeches illustrating moderate populism are incorporated during training.

The GPD’s data structure encodes both raw texts and metadata, organizing entries by country, leader, speech type, and language. Each annotation is produced by coders trained using a protocol mirroring the chain-of-thought reasoning expected for context-sensitive, ideational content (Tamaki et al., 8 Oct 2025).
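As a rough illustration of the record layout described above, the following Python sketch models one scored speech and groups records by a metadata field. The class name `SpeechRecord`, the helper `group_by`, and the exact field names are assumptions made for this example; the source does not specify the GPD's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Speech categories named in the text; the string spellings are assumptions.
SPEECH_TYPES = {"campaign", "famous", "international", "ceremonial"}


@dataclass(frozen=True)
class SpeechRecord:
    """Hypothetical record: raw text plus metadata and a holistic grade."""
    country: str
    leader: str
    speech_type: str
    language: str
    text: str
    score: float  # holistic grade on the 0-2 scale

    def __post_init__(self) -> None:
        # Reject records that fall outside the categories or the 0-2 scale.
        if self.speech_type not in SPEECH_TYPES:
            raise ValueError(f"unknown speech type: {self.speech_type!r}")
        if not 0.0 <= self.score <= 2.0:
            raise ValueError(f"score out of range: {self.score}")


def group_by(records: List[SpeechRecord], field: str) -> Dict[str, List[SpeechRecord]]:
    """Group records by one metadata field, e.g. 'country' or 'leader'."""
    groups: Dict[str, List[SpeechRecord]] = defaultdict(list)
    for record in records:
        groups[getattr(record, field)].append(record)
    return dict(groups)
```

Validating the score at construction time keeps out-of-range grades from entering later aggregate statistics.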

2. AI-Driven Measurement: Chain-of-Thought Prompting and Calibration

Recent work demonstrates that rubric- and anchor-guided chain-of-thought (CoT) prompting enables LLMs to replicate expert human coders in the GPD setting (Tamaki et al., 8 Oct 2025). Specifically, LLMs receive the full documentation: theoretical definitions, grading instructions, rubric schemas, anchor speeches with their scores, and integration instructions. The models then read a test speech, reflect on the rubric and anchor examples, and output both a score (on the 0–2 scale) and a detailed reasoning chain.
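The prompt assembly described above can be sketched as follows. This is a minimal illustration, not the prompt used by Tamaki et al.: the function name `build_grading_prompt`, the section titles, and the wording of the final task instruction are all assumptions. Only the list of components (definitions, grading instructions, rubric, scored anchors, integration instructions, test speech) comes from the text.

```python
from typing import List, Tuple


def build_grading_prompt(
    definitions: str,
    grading_instructions: str,
    rubric: str,
    anchors: List[Tuple[str, float]],  # (anchor speech text, human score)
    integration_instructions: str,
    test_speech: str,
) -> str:
    """Concatenate the documentation and the test speech into one prompt."""
    anchor_text = "\n\n".join(
        f"Anchor {i} (score {score:.1f}):\n{text}"
        for i, (text, score) in enumerate(anchors, start=1)
    )
    sections = [
        ("THEORETICAL DEFINITIONS", definitions),
        ("GRADING INSTRUCTIONS", grading_instructions),
        ("RUBRIC", rubric),
        ("ANCHOR SPEECHES", anchor_text),
        ("INTEGRATION INSTRUCTIONS", integration_instructions),
        ("TEST SPEECH", test_speech),
        # Hypothetical task wording; asks for reasoning first, then a score.
        ("TASK", "Reason step by step against the rubric and the anchors, "
                 "then give a single score between 0 and 2."),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```

Placing the test speech after the anchors means the model has seen the full scale before it reads the text to be graded.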

Models such as GPT-5 (high reasoning mode) and Qwen3 235B (reasoning-enabled) score speeches with accuracy on par with human coders. Agreement is evaluated with Pearson correlation, Spearman rank correlation, the intraclass correlation coefficient (ICC), Lin’s concordance correlation coefficient (CCC), Krippendorff’s α, and Bland–Altman analysis. Calibration is assessed via regression of the form:

$$\mathrm{Human} \approx a + b \times \mathrm{AI},$$

with $a \approx 0$ and $b \approx 1$ denoting optimal alignment (Tamaki et al., 8 Oct 2025).
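The calibration regression can be illustrated with a small ordinary least squares fit of human scores on AI scores. The sketch below uses only the Python standard library. The function name `calibrate` and the toy scores are hypothetical and are not taken from the GPD or from the cited paper.

```python
from statistics import mean
from typing import Sequence, Tuple


def calibrate(human: Sequence[float], ai: Sequence[float]) -> Tuple[float, float]:
    """Fit Human ~ a + b * AI by ordinary least squares; return (a, b)."""
    if len(human) != len(ai) or len(human) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_ai, mean_human = mean(ai), mean(human)
    # Sum of squares of AI scores around their mean.
    sxx = sum((x - mean_ai) ** 2 for x in ai)
    if sxx == 0:
        raise ValueError("AI scores are constant; slope is undefined")
    # Sum of cross-products between AI and human deviations.
    sxy = sum((x - mean_ai) * (y - mean_human) for x, y in zip(ai, human))
    b = sxy / sxx
    a = mean_human - b * mean_ai
    return a, b


# Hypothetical toy data: AI scores are compressed toward the scale midpoint.
ai_scores = [0.5, 0.75, 1.0, 1.25]
human_scores = [0.0, 0.5, 1.0, 1.5]
a, b = calibrate(human_scores, ai_scores)
```

In this constructed example the human scores equal 2 × AI − 1, so the fit returns a = −1 and b = 2. A slope above 1 indicates that the AI scores span a narrower range than the human scores.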

3. Model Diversity, Implementation, and Error Analysis

A broad set of LLMs has been evaluated on the GPD replication task, ranging from proprietary (GPT-5) to advanced open-weight architectures (GPT-oss 120B, DeepSeek R1/R3, Qwen3 235B, Llama 4 Maverick/Scout). Model selection targets both reasoning depth (pure chain-of-thought versus minimal chain) and architectural scaling (dense and mixture-of-experts). Empirical results indicate that top reasoning-enabled LLMs display high reliability and agreement with human scores, while smaller or reasoning-deficient models suffer from larger scale compression and reduced reliability. A modest tendency for scores to regress toward the mean is observed: extremes are shifted toward the center, but ranking order is preserved (Tamaki et al., 8 Oct 2025).

When evaluated across 12 speeches from the UK, Turkey, and Montenegro, high-capacity, reasoning-enabled LLMs achieve strong agreement with human coding. Models with less capacity or less prompt adaptation exhibit larger mean absolute error and poor scale calibration.
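To make the error measures in this section concrete, the sketch below computes mean absolute error, mean bias (AI minus human), and Lin's concordance correlation coefficient, using the population (1/n) moments of Lin's original definition. The function names and the three-speech toy data are hypothetical and do not reproduce any result from the paper.

```python
from statistics import mean
from typing import Sequence


def mean_absolute_error(human: Sequence[float], ai: Sequence[float]) -> float:
    """Average absolute gap between AI and human scores."""
    return mean(abs(x - y) for x, y in zip(ai, human))


def mean_bias(human: Sequence[float], ai: Sequence[float]) -> float:
    """Average signed gap; negative means the AI scores lower than humans."""
    return mean(x - y for x, y in zip(ai, human))


def lin_ccc(human: Sequence[float], ai: Sequence[float]) -> float:
    """Lin's CCC: 2*cov / (var_h + var_a + (mean_h - mean_a)**2)."""
    n = len(human)
    if n != len(ai) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mh, ma = mean(human), mean(ai)
    cov = sum((h - mh) * (x - ma) for h, x in zip(human, ai)) / n
    var_h = sum((h - mh) ** 2 for h in human) / n
    var_a = sum((x - ma) ** 2 for x in ai) / n
    denom = var_h + var_a + (mh - ma) ** 2
    if denom == 0:
        raise ValueError("both series are constant and equal; CCC is undefined")
    return 2 * cov / denom


# Hypothetical toy data: same ranking, but AI extremes pulled toward the center.
human_scores = [0.0, 1.0, 2.0]
ai_scores = [0.5, 1.0, 1.5]
ccc = lin_ccc(human_scores, ai_scores)
mae = mean_absolute_error(human_scores, ai_scores)
bias = mean_bias(human_scores, ai_scores)
```

On this toy data the ranking is identical and the mean bias is zero, yet the CCC is about 0.8, because the concordance measure penalizes the narrower spread of the AI scores. This is the pattern the text calls scale compression.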

4. Methodological Advances and Utility for Populism Research

The GPD’s anchor-guided, holistic grading methodology provides an interpretative framework for coders and AI alike to identify latent patterns such as anti-elitism, people-centrism, and Manichaean framing. The explicit chain-of-thought prompting mimics human deliberation and ensures the model reasons through complex, context-sensitive rhetorical phenomena. The resulting automated approach is cost-effective, scalable, and language-agnostic, facilitating large-scale comparative content analysis beyond traditional keyword or dictionary methods.

The method’s transparency and reproducibility are enhanced by requiring LLMs to output detailed reasoning chains, which document the basis for each populism score and enable auditability of AI-driven grading. Adaptability is ensured by swapping rubric and anchor sets, allowing extension to related constructs (e.g., nationalist or crisis framings).
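One way to keep graded outputs auditable is to store the reasoning chain next to the parsed score. The sketch below assumes a hypothetical output convention in which the model ends its answer with a line of the form `SCORE: <number>`. That convention, the `Graded` type, and `parse_graded_output` are illustrative assumptions and are not described in the source.

```python
import re
from typing import NamedTuple


class Graded(NamedTuple):
    score: float
    reasoning: str


# Hypothetical convention: the final score sits on its own "SCORE: x" line.
_SCORE_LINE = re.compile(r"^SCORE:\s*([0-9]+(?:\.[0-9]+)?)\s*$", re.MULTILINE)


def parse_graded_output(raw: str) -> Graded:
    """Split a model response into its reasoning chain and final score."""
    matches = list(_SCORE_LINE.finditer(raw))
    if not matches:
        raise ValueError("no 'SCORE:' line found in model output")
    last = matches[-1]  # use the last score line if several appear
    score = float(last.group(1))
    if not 0.0 <= score <= 2.0:
        raise ValueError(f"score outside the 0-2 scale: {score}")
    # Everything before the final score line is kept as the audit trail.
    reasoning = raw[: last.start()].strip()
    return Graded(score=score, reasoning=reasoning)
```

Rejecting responses without a valid score, rather than guessing one, keeps malformed outputs visible for manual review.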

5. Global Coverage, Comparative Analysis, and Extension Potential

The GPD’s multilingual and cross-contextual span supports comparative research into both temporal and spatial variation in populist rhetoric. Its grading protocol is demonstrably robust for diverse speech types and cultural frames: anchor speech training coupled with rubric consistency enables both human and AI coders to operate reliably across shifting political landscapes.

A plausible implication is that the GPD can be extended to measure additional rhetorical constructs or be integrated with computational models such as PopBERT (Erhard et al., 2023), multi-task learning architectures (Huguet-Cabot et al., 2021), and network-based approaches (Garcia-Arteaga et al., 2021). By enabling both dynamic tracking and historical panel analysis, the GPD facilitates the study of populism at scale and with methodological rigor.

6. Limitations and Future Directions

The GPD annotation approach yields high inter-coder reliability and strong AI-human concordance; however, modest scale compression and minor negative bias may require additional calibration when expanding to new contexts or speech genres. The annotation process is sensitive to anchor selection, rubric clarity, and documentation adaptation. Future methodological refinements may include more granular multi-label schemes, explicit sub-frame detection (e.g., host ideologies, following Erhard et al., 2023), and hybrid quantitative-qualitative approaches.

Extending the GPD protocol to cover related phenomena—such as nationalist discourse, crisis language, or emotion-infused populism—would capitalize on the demonstrated robustness of chain-of-thought LLMs and the anchor-rubric methodology. Scalable, automated coding guided by documented reasoning chains appears poised to underpin the next era of comparative rhetorical analysis in global political research.

7. Summary Table: Key Features of the Global Populism Database (GPD)

Dimension | GPD Protocol | Computational Integration
Annotation method | Holistic Grading (0–2 score) | Chain-of-thought LLM grading
Reference materials | Rubric & anchor speeches | Explicit documentation adaptation
Model evaluation | Pearson, ICC, CCC, α | Calibration regression
Comparative scope | Multinational, multilingual | AI-aided cross-context analysis
Reliability | High for reasoning-enabled models | Sensitivity to prompt & anchors
Extensibility | New frames via rubric swap | Cross-domain adaptation

The Global Populism Database establishes an authoritative, empirically grounded platform for the measurement and comparative analysis of populist speech at global scale. It integrates anchor-driven annotation, holistic grading, and advanced AI reasoning to yield reproducible, transparent, and scalable indicators of ideational rhetoric, facilitating robust political science research and real-time content analysis (Tamaki et al., 8 Oct 2025).
