UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification (1809.01479v5)

Published 3 Sep 2018 in cs.IR, cs.AI, cs.CL, and cs.LG

Abstract: The Fact Extraction and VERification (FEVER) shared task was launched to support the development of systems able to verify claims by extracting supporting or refuting facts from raw text. The shared task organizers provide a large-scale dataset for the consecutive steps involved in claim verification, in particular, document retrieval, fact extraction, and claim classification. In this paper, we present our claim verification pipeline approach, which, according to the preliminary results, scored third in the shared task, out of 23 competing systems. For the document retrieval, we implemented a new entity linking approach. In order to be able to rank candidate facts and classify a claim on the basis of several selected facts, we introduce two extensions to the Enhanced LSTM (ESIM).

UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification

The paper "UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification" addresses the pressing issue of automated claim verification, a vital component in combating the spread of false and misleading information on the internet. The authors detail their approach developed for the Fact Extraction and VERification (FEVER) shared task, which focused on building systems capable of verifying claims via fact extraction from raw text, specifically leveraging a dataset sourced from Wikipedia.

Overview of Methodology

The research contributions address the three critical sub-tasks that make up a claim verification pipeline: document retrieval, sentence selection, and recognizing textual entailment:

  1. Document Retrieval: The authors apply an entity linking approach driven by constituency parsing and handcrafted rules. Potential entity mentions are extracted from the claim via constituency parsing and matched to Wikipedia articles through the MediaWiki API, which improves the precision of document matching beyond traditional named entity recognition methods (a retrieval sketch follows this list).
  2. Sentence Selection: The paper proposes enhancements to the Enhanced Sequential Inference Model (ESIM) to generate ranking scores for claim-sentence pairs, selecting from the retrieved documents the sentences most likely to form an evidence set. The adaptation reroutes ESIM's output layer to predict a ranking score rather than an entailment label (see the ranking sketch after this list).
  3. Recognizing Textual Entailment: The authors further modify ESIM to process multiple evidence sentences together with the claim through an attention mechanism. This version uses GloVe and fastText word embeddings trained on Wikipedia data, allowing more effective handling of numerical and categorical information, an aspect often problematic in entailment tasks (an attention-pooling sketch also follows the list).
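
To make the document-retrieval step concrete, here is a minimal sketch of the general idea, not the authors' code: candidate entity mentions are pulled from the claim and each is looked up against the public MediaWiki search endpoint. spaCy's noun chunks stand in for the paper's constituency-parse-based mention extraction, and the function names and result limit are illustrative assumptions.

```python
# Sketch of entity-linking-style document retrieval (illustrative, not the authors' code).
# Noun phrases from the claim serve as candidate entity mentions; each mention is sent
# to the public MediaWiki opensearch endpoint to collect candidate Wikipedia page titles.
import requests
import spacy

nlp = spacy.load("en_core_web_sm")  # lightweight stand-in for a constituency parser

def candidate_mentions(claim: str) -> list[str]:
    """Extract candidate entity mentions (noun chunks plus the full claim)."""
    doc = nlp(claim)
    mentions = {chunk.text for chunk in doc.noun_chunks}
    mentions.add(claim)  # the whole claim sometimes matches a page title directly
    return sorted(mentions)

def wikipedia_candidates(mention: str, limit: int = 5) -> list[str]:
    """Query the MediaWiki opensearch endpoint for page titles matching a mention."""
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "opensearch", "search": mention,
                "limit": limit, "format": "json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()[1]  # second element of the response is the list of matching titles

def retrieve_documents(claim: str) -> set[str]:
    titles = set()
    for mention in candidate_mentions(claim):
        titles.update(wikipedia_candidates(mention))
    return titles

print(retrieve_documents("Roman Atwood is a content creator."))
```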
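
The sentence-selection extension can likewise be sketched in a few lines: the final entailment classifier is replaced by a single scoring head and trained with a pairwise ranking objective. The `pair_encoder`, hidden dimension, and hinge margin below are placeholders assumed for illustration; the actual ESIM encoder and training setup are not reproduced here.

```python
# Sketch of turning an entailment model into a claim-sentence ranker (step 2).
# The ESIM encoder is abstracted away as `pair_encoder`; only the scoring head and
# a pairwise hinge loss are shown. Dimensions and margin are illustrative.
import torch
import torch.nn as nn

class SentenceRanker(nn.Module):
    def __init__(self, pair_encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.pair_encoder = pair_encoder            # e.g. an ESIM-style claim/sentence encoder
        self.score_head = nn.Linear(hidden_dim, 1)  # single relevance score instead of 3 classes

    def forward(self, claim, sentence) -> torch.Tensor:
        joint = self.pair_encoder(claim, sentence)  # (batch, hidden_dim) joint representation
        return self.score_head(joint).squeeze(-1)   # (batch,) ranking scores

def pairwise_hinge_loss(pos_scores, neg_scores, margin: float = 1.0):
    """Positive evidence sentences should outscore sampled negatives by a margin."""
    return torch.clamp(margin - pos_scores + neg_scores, min=0.0).mean()
```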
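
For the entailment step, the core idea is attention pooling over per-sentence representations before a three-way verdict. The following sketch assumes the same placeholder `pair_encoder` and an additive attention layer; it illustrates the multi-sentence combination, not the authors' exact architecture.

```python
# Sketch of combining several retrieved sentences into one verdict (step 3).
# Each claim/sentence pair is encoded separately; attention weights pool the
# per-sentence vectors before a 3-way classification (illustrative architecture).
import torch
import torch.nn as nn

class MultiSentenceClassifier(nn.Module):
    def __init__(self, pair_encoder: nn.Module, hidden_dim: int, num_labels: int = 3):
        super().__init__()
        self.pair_encoder = pair_encoder
        self.attention = nn.Linear(hidden_dim, 1)            # scores each evidence sentence
        self.classifier = nn.Linear(hidden_dim, num_labels)  # SUPPORTED / REFUTED / NEI

    def forward(self, claim, sentences):
        # Encode the claim against each candidate evidence sentence: (batch, n_sents, hidden).
        reps = torch.stack([self.pair_encoder(claim, s) for s in sentences], dim=1)
        weights = torch.softmax(self.attention(reps), dim=1)  # (batch, n_sents, 1)
        pooled = (weights * reps).sum(dim=1)                  # (batch, hidden)
        return self.classifier(pooled)                        # (batch, num_labels)
```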

Results and Analysis

The UKP-Athene system achieved notable results, ranking third among the 23 competing systems in the FEVER shared task. Reported metrics include document retrieval accuracy of up to 93.55% and sentence selection recall of 87.10%. The system reached a FEVER score of 64.74%, roughly double that of the baseline pipeline.

These results support the effectiveness of the entity linking approach to document retrieval and of the model ensembling strategy used for sentence selection. A detailed error analysis points to challenges such as spelling errors, missing entities, and numerical interpretation that impacted performance, and provides directions for further refinement.
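
For readers unfamiliar with the metric, the FEVER score cited above can be illustrated with a simplified sketch: a claim counts as correct only if the predicted label matches the gold label and, unless the gold label is NOT ENOUGH INFO, the predicted evidence contains at least one complete gold evidence set. The data layout below is an assumption for illustration, not the official scorer.

```python
# Simplified sketch of the FEVER score: label must be correct and, for non-NEI claims,
# the predicted evidence must cover at least one complete gold evidence set.
def fever_score(predictions) -> float:
    correct = 0
    for pred in predictions:
        if pred["predicted_label"] != pred["gold_label"]:
            continue
        if pred["gold_label"] == "NOT ENOUGH INFO":
            correct += 1
            continue
        predicted = set(map(tuple, pred["predicted_evidence"]))  # (page, sentence_id) pairs
        if any(set(map(tuple, gold)) <= predicted for gold in pred["gold_evidence_sets"]):
            correct += 1
    return correct / len(predictions)
```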

Implications and Future Directions

The research underscores the importance of robust entity linking in claim verification pipelines and of sophisticated sentence scoring mechanisms for constructing reliable evidence sets. Future work could improve the representation of numerical data and explore deeper semantic modeling to enhance classification accuracy in edge cases, for example by integrating more advanced contextual embeddings or adapting the architecture to incorporate multi-modal data sources.

Overall, this paper contributes to the ongoing development of automated fact-checking approaches, advancing methods that can be applied to large-scale text data effectively. It also suggests promising lines of inquiry in AI's application to dynamic content evaluation systems where comprehensive and evidence-based validation is necessary.

Authors (7)
  1. Andreas Hanselowski (4 papers)
  2. Hao Zhang (947 papers)
  3. Zile Li (5 papers)
  4. Daniil Sorokin (4 papers)
  5. Benjamin Schiller (10 papers)
  6. Claudia Schulz (14 papers)
  7. Iryna Gurevych (264 papers)
Citations (179)