
CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims (2012.00614v2)

Published 1 Dec 2020 in cs.CL and cs.AI

Abstract: We introduce CLIMATE-FEVER, a new publicly available dataset for verification of climate change-related claims. By providing a dataset for the research community, we aim to facilitate and encourage work on improving algorithms for retrieving evidential support for climate-specific claims, addressing the underlying language understanding challenges, and ultimately help alleviate the impact of misinformation on climate change. We adapt the methodology of FEVER [1], the largest dataset of artificially designed claims, to real-life claims collected from the Internet. While during this process, we could rely on the expertise of renowned climate scientists, it turned out to be no easy task. We discuss the surprising, subtle complexity of modeling real-world climate-related claims within the FEVER framework, which we believe provides a valuable challenge for general natural language understanding. We hope that our work will mark the beginning of a new exciting long-term joint effort by the climate science and AI community.

Authors (5)
  1. Thomas Diggelmann (1 paper)
  2. Jordan Boyd-Graber (68 papers)
  3. Jannis Bulian (14 papers)
  4. Massimiliano Ciaramita (15 papers)
  5. Markus Leippold (24 papers)
Citations (167)

Summary

CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims

The paper presents an effort to address climate-change misinformation through the development of the CLIMATE-FEVER dataset. The work adapts the FEVER claim-verification framework to the domain-specific requirements of climate science, bringing fact-checking algorithms to bear on real-world claims. The dataset comprises 1,535 Internet-sourced claims on climate-change topics, each annotated with evidence retrieved from the English Wikipedia, for a total of 7,675 annotated claim-evidence pairs, and is publicly available to the research community.
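For reference, the dataset can be loaded with the Hugging Face `datasets` library. The sketch below assumes the commonly used Hub identifier `climate_fever` and the field names exposed by that mirror; consult the dataset card for the live schema.

```python
# Minimal loading sketch using the Hugging Face `datasets` library.
# Assumes the dataset is published under the Hub identifier "climate_fever"
# with the field names shown below (an assumption; check the dataset card).
from datasets import load_dataset

ds = load_dataset("climate_fever", split="test")  # 1,535 real-world claims

example = ds[0]
print(example["claim"])        # claim text collected from the Internet
print(example["claim_label"])  # overall verdict aggregated over evidence
for ev in example["evidences"]:
    # each claim is paired with annotated Wikipedia evidence sentences
    print(ev["evidence_label"], "-", ev["article"], "-", ev["evidence"][:60])
```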

The paper details the construction of the CLIMATE-FEVER dataset, beginning with the collection of real-world claims from both scientifically informed and climate-skeptical websites. After gathering over 3,000 candidate claims, climate-science experts were engaged to annotate the claims and assess their veracity, yielding a refined collection based on annotator consensus. For evidence retrieval, a pipeline system mirroring the FEVER methodology was constructed: document-level retrieval using entity linking, sentence-level retrieval employing dense vector embeddings from a Siamese ALBERT model, and a final sentence re-ranking step.
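The sketch below illustrates the dense sentence-retrieval pattern described above, substituting an off-the-shelf sentence-transformers bi-encoder for the paper's Siamese ALBERT model; the claim and candidate sentences are invented for illustration.

```python
# Sketch of the dense sentence-retrieval step: embed the claim and the
# candidate sentences with a bi-encoder, then rank candidates by cosine
# similarity. A generic sentence-transformers model stands in for the
# paper's Siamese ALBERT encoder.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in bi-encoder

claim = "Global warming is driving polar bears toward extinction."
candidate_sentences = [
    "Polar bears depend on sea ice for hunting seals.",
    "The Eiffel Tower was completed in 1889.",
    "Declining Arctic sea ice reduces polar bear hunting grounds.",
]

claim_emb = model.encode(claim, convert_to_tensor=True)
cand_embs = model.encode(candidate_sentences, convert_to_tensor=True)

# Cosine similarity between the claim and every candidate sentence.
scores = util.cos_sim(claim_emb, cand_embs)[0]
ranked = sorted(zip(candidate_sentences, scores.tolist()),
                key=lambda pair: pair[1], reverse=True)
for sentence, score in ranked:
    print(f"{score:.3f}  {sentence}")
```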

Beyond supporting the development of evidence-retrieval systems, the dataset provides a baseline for evaluating claim-validation algorithms. In the reported evaluation, a FEVER-trained entailment predictor achieved only 38.78% label accuracy on the CLIMATE-FEVER dataset, underscoring how much more complex and subtle real-world climate claims are to interpret than artificially constructed ones. For instance, statements in CLIMATE-FEVER may involve nuanced metrics or disputed assertions not typically found in synthetic datasets.
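To make the entailment-prediction setup concrete, the following sketch frames verification as natural language inference using a publicly available MNLI model, a stand-in rather than the paper's FEVER-trained predictor: the evidence sentence serves as the premise, the claim as the hypothesis, and the three NLI classes map onto FEVER-style verdict labels.

```python
# Sketch of FEVER-style verdict prediction as natural language inference:
# premise = evidence sentence, hypothesis = claim. Uses an off-the-shelf
# MNLI model as a stand-in for the paper's FEVER-trained predictor.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

evidence = "Declining Arctic sea ice reduces polar bear hunting grounds."
claim = "Global warming is driving polar bears toward extinction."

inputs = tokenizer(evidence, claim, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]

# Map the model's NLI classes onto FEVER-style verdict labels.
to_verdict = {"ENTAILMENT": "SUPPORTS",
              "CONTRADICTION": "REFUTES",
              "NEUTRAL": "NOT_ENOUGH_INFO"}
pred = model.config.id2label[int(probs.argmax())]
print(to_verdict[pred], f"(p={probs.max():.3f})")
```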

The dataset carries implications for both research and practice, chiefly improving the robustness and accuracy of machine-learning models on fact-checking tasks within the nuanced context of climate science. The low inter-annotator agreement reflects the inherent difficulty of assessing real-world evidence and highlights the need for specialized models beyond what current FEVER-trained systems offer. Future work aims to extend the dataset, including better handling of disputed claims and improved evidence-evaluation techniques, thereby fostering collaboration between AI researchers and climate scientists.
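As an illustration of how contested annotations can be quantified (the released dataset records per-evidence annotator votes and their entropy), here is a minimal aggregation sketch, not the authors' exact code:

```python
# Illustrative sketch (not the paper's aggregation code): combine annotator
# votes for one claim-evidence pair into a majority label, and measure how
# contested the annotation is via the Shannon entropy of the vote counts.
from collections import Counter
import math

def aggregate(votes):
    """votes: list of annotator labels for one claim-evidence pair."""
    counts = Counter(votes)
    total = len(votes)
    # Majority label; ties break by first occurrence in the vote list.
    majority_label, _ = counts.most_common(1)[0]
    # Entropy of the vote distribution: 0 = unanimous, higher = disputed.
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    return majority_label, entropy

print(aggregate(["SUPPORTS"] * 5))
# -> ('SUPPORTS', 0.0): unanimous annotation
print(aggregate(["SUPPORTS", "REFUTES", "NOT_ENOUGH_INFO",
                 "SUPPORTS", "REFUTES"]))
# -> ('SUPPORTS', ~1.52): heavily disputed annotation
```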

The authors conclude by stressing the importance of developing technological solutions that support human fact-checkers rather than replace them, especially in countering misinformation that affects climate policy and public understanding. Insights gained from the CLIMATE-FEVER project reveal opportunities for integrating interdisciplinary expertise to advance automated fact-checking.