Explainable Automated Fact-Checking for Public Health Claims
The paper "Explainable Automated Fact-Checking for Public Health Claims" by Kotonya and Toni addresses two limitations of current automated fact-checking models: a narrow emphasis on political claims and a scarcity of systems that explain their verdicts. The authors argue that expertise-driven fact-checking models are needed for claims in complex domains such as public health. To support this line of work, they introduce the PubHealth dataset, comprising 11.8K public health-related claims, each paired with a journalist-crafted explanation of the rationale behind its veracity label.
Objectives and Contributions
The paper delineates two primary research tasks: veracity prediction and explanation generation. The authors aim to develop a system capable of not only predicting the veracity of claims but also providing understandable explanations, particularly for audiences lacking domain-specific knowledge. The paper's key contributions include:
- Dataset Creation: The introduction of PubHealth, a novel dataset tailored for the evaluation of fact-checking systems in the public health domain, distinguished by its incorporation of gold-standard journalistic explanations.
- Methodological Framework: The authors present a framework leveraging domain-specific training data to enhance the performance of fact-checking models in veracity prediction and explanation generation.
- Coherence Properties for Explanation Evaluation: The paper proposes three coherence properties—strong global coherence, weak global coherence, and local coherence—for evaluating the fidelity of the explanations a system produces.
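The three coherence properties can be phrased as checks over entailment and contradiction relations between explanation sentences and the claim: strong global coherence requires every explanation sentence to entail the claim, weak global coherence requires that no sentence contradict it, and local coherence requires that no two explanation sentences contradict each other. A minimal sketch of these checks, with the entailment and contradiction predicates supplied by the caller (in practice an NLI model; the `Relation` interface here is an assumption for illustration):

```python
from itertools import combinations
from typing import Callable, List

# Hypothetical NLI interface: a real system would back these predicates
# with a trained NLI model; here they are placeholders the caller supplies.
Relation = Callable[[str, str], bool]

def strong_global_coherence(sents: List[str], claim: str, entails: Relation) -> bool:
    # Every explanation sentence must entail the claim.
    return all(entails(s, claim) for s in sents)

def weak_global_coherence(sents: List[str], claim: str, contradicts: Relation) -> bool:
    # No explanation sentence may contradict the claim.
    return not any(contradicts(s, claim) for s in sents)

def local_coherence(sents: List[str], contradicts: Relation) -> bool:
    # No pair of explanation sentences may contradict each other.
    return not any(contradicts(a, b) or contradicts(b, a)
                   for a, b in combinations(sents, 2))
```

Note that strong global coherence implies weak global coherence whenever entailment and contradiction are mutually exclusive, which is why the paper treats the weak property as the more permissive of the two.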
Methodological Innovations
The approach applies modern NLP techniques to both tasks. For veracity prediction, pre-trained models such as SciBERT and BioBERT are fine-tuned on in-domain data to improve accuracy, with Sentence-BERT (S-BERT) used to select evidence sentences before prediction. Explanation generation is cast as joint extractive-abstractive summarization, so that explanations are both contextually relevant and comprehensible to non-expert audiences.
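The evidence-selection step amounts to ranking candidate sentences by their embedding similarity to the claim and keeping the top-k. The sketch below illustrates the ranking logic only; the bag-of-words vectors are a toy stand-in for the S-BERT sentence embeddings the paper actually uses:

```python
import math
from collections import Counter
from typing import List

def bow_vector(text: str) -> Counter:
    # Toy stand-in for an S-BERT sentence embedding: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_evidence(claim: str, sentences: List[str], k: int = 5) -> List[str]:
    # Rank candidate evidence sentences by similarity to the claim
    # and keep the top-k, mirroring the S-BERT selection step.
    cv = bow_vector(claim)
    ranked = sorted(sentences, key=lambda s: cosine(cv, bow_vector(s)), reverse=True)
    return ranked[:k]
```

Swapping `bow_vector` for a real sentence encoder leaves `select_evidence` unchanged, which is the appeal of this retrieve-then-predict design.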
Evaluation and Results
The paper employs comprehensive evaluation strategies, encompassing both human and automated approaches. The veracity prediction models, particularly SciBERT and BioBERT v1.1, demonstrated superior performance with significant improvements in F1, precision, and accuracy over non-domain-specific models. In explanation generation, the ExplainerFC-Expert model trained with domain-specific data outperformed general models on ROUGE metrics. Crucially, the paper shows that Natural Language Inference (NLI) models can approximate human judgments of explanation coherence, proving especially useful for the weak global coherence and local coherence properties.
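The ROUGE scores used here measure n-gram overlap between a generated explanation and the gold journalist explanation. A simplified ROUGE-1 F1 in plain Python illustrates the idea (real evaluations use a ROUGE package with stemming and proper tokenization, so scores will differ):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Simplified ROUGE-1: unigram-overlap precision, recall, and F1,
    # without the stemming and tokenization of standard ROUGE tooling.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Because overlap counts are clipped by the reference, a generated explanation cannot inflate its score by repeating matching words.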
Implications and Future Directions
The insights from this paper underscore the potential of domain-specific models trained on tailored datasets to significantly enhance both the predictive power and trustworthiness of automated fact-checking systems. The research presents a compelling case for the broader adoption of explainable AI systems, particularly in domains where the implications of misinformation can have severe real-world consequences, such as public health.
Looking forward, the authors suggest expanding this research agenda to encompass additional specialized domains. Moreover, they advocate for a deeper exploration of the alignment between veracity predictions and the quality of generated explanations, which could further bolster the credibility and utility of automated fact-checking systems.
Understanding and improving the coherence and comprehensibility of AI-generated explanations remain pivotal challenges in fostering trust and efficacy in AI systems, challenges this research begins to address systematically.