- The paper presents a dual-encoder contrastive pre-training framework that integrates text and graph representations to enhance event extraction.
- It employs AMR-derived semantic and structural signals to cluster event triggers and arguments, improving extraction quality.
- CLEVE achieves significant performance gains, reaching a 79.8% F1-score on ACE 2005 in the supervised setting and robust results in unsupervised settings.
CLEVE: Contrastive Pre-training for Event Extraction
The paper addresses the challenge of enhancing event extraction (EE) by proposing CLEVE, a contrastive pre-training framework for event extraction. While pre-trained language models (PLMs) have significantly advanced EE through fine-tuning, current pre-training paradigms do not explicitly incorporate event-level semantic structures, leading to suboptimal utilization of large-scale unsupervised data. CLEVE introduces a dual-component approach, integrating a text encoder and a graph encoder to learn event semantics and event structures, respectively, from unsupervised data and its automatically parsed semantic structures.
Technical Approach
CLEVE’s methodology centers on contrastive signals derived from semantic structures, specifically Abstract Meaning Representation (AMR), which represents a sentence’s semantics as a directed acyclic graph capturing the relationships and roles among its words. The authors employ a self-supervised mechanism to pre-train two central components:
- Text Encoder for Event Semantics: Built on a PLM, the text encoder is trained with a contrastive objective that pulls semantically related words (i.e., triggers and their arguments) closer together in the embedding space, using AMR-derived relations as self-supervision signals (see the first sketch after this list). This strategy reinforces the model’s capacity to discern and represent event-specific semantics.
- Graph Encoder for Event Structures: The graph encoder, a graph neural network (GNN), is pre-trained contrastively on subgraphs sampled from AMR parses, promoting the understanding of complex event structures (see the second sketch after this list). This component is specifically designed to boost the model’s ability to capture event structures consistently across diverse datasets.
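To make the first objective concrete, below is a minimal PyTorch sketch of AMR-guided contrastive pre-training for the text encoder. It is an illustrative assumption, not CLEVE's actual implementation: the sentence and its AMR edges are hand-written (CLEVE parses a large corpus automatically), the sub-token alignment is naive, and the InfoNCE-style loss is a common stand-in for the paper's contrastive objective.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Hand-written AMR-style edges for one sentence (hypothetical example).
# Core roles such as :ARG0 / :ARG1 connect the trigger "arrested" to its
# arguments; these word pairs are the self-supervised positives.
sentence = "The police arrested the suspect in the park."
amr_edges = [("arrested", ":ARG0", "police"),
             ("arrested", ":ARG1", "suspect"),
             ("arrested", ":location", "park")]

tok = AutoTokenizer.from_pretrained("roberta-base")
enc = AutoModel.from_pretrained("roberta-base")

inputs = tok(sentence, return_tensors="pt")
hidden = enc(**inputs).last_hidden_state[0]          # (seq_len, dim)
ids = inputs["input_ids"][0].tolist()

def word_embedding(word: str) -> torch.Tensor:
    """Mean-pool the contextual sub-token embeddings of `word`.
    Naive sub-token matching; real code would align by character offsets."""
    word_ids = tok(" " + word, add_special_tokens=False)["input_ids"]
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i:i + len(word_ids)] == word_ids:
            return hidden[i:i + len(word_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in sentence")

# InfoNCE-style objective: each trigger embedding should be most similar
# to its own AMR-linked argument; the other arguments act as negatives.
triggers = torch.stack([word_embedding(h) for h, _, _ in amr_edges])
arguments = torch.stack([word_embedding(t) for _, _, t in amr_edges])
logits = F.normalize(triggers, dim=-1) @ F.normalize(arguments, dim=-1).T
loss = F.cross_entropy(logits / 0.07, torch.arange(len(amr_edges)))
loss.backward()  # gradients update the RoBERTa text encoder
```

With a single trigger this toy batch is partly degenerate; real pre-training draws trigger-argument pairs from a large corpus so that negatives come from unrelated sentences.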
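The structural objective can be sketched in the same spirit. The `TinyGNN` class, the random node features, and the crude subgraph sampler below are placeholder assumptions (CLEVE samples connected AMR subgraphs and initializes node features from the text encoder); the sketch only illustrates the subgraph-discrimination idea: subgraphs of the same AMR graph are positives, subgraphs of different graphs negatives.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGNN(nn.Module):
    """Two rounds of mean-neighbour message passing plus mean pooling.
    A stand-in for the paper's GNN encoder, not its actual architecture."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.lin1 = nn.Linear(dim, dim)
        self.lin2 = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        adj = adj + torch.eye(adj.size(0))           # add self-loops
        adj = adj / adj.sum(dim=1, keepdim=True)     # row-normalise
        x = F.relu(self.lin1(adj @ x))
        x = F.relu(self.lin2(adj @ x))
        return x.mean(dim=0)                         # graph-level embedding

def random_subgraph(x, adj, k):
    """Induced subgraph on k random nodes (a crude sampler; CLEVE samples
    connected AMR subgraphs instead)."""
    idx = torch.randperm(x.size(0))[:k]
    return x[idx], adj[idx][:, idx]

torch.manual_seed(0)
gnn = TinyGNN()

# Three toy "AMR graphs": random node features and random symmetric edges.
# In CLEVE, node features come from the pre-trained text encoder.
graphs = []
for n in (6, 7, 5):
    x = torch.randn(n, 64)
    edges = (torch.rand(n, n) > 0.6).float()
    graphs.append((x, (edges + edges.T).clamp(max=1)))

# Subgraph discrimination: two subgraphs of the SAME graph form a positive
# pair; subgraphs of the other graphs serve as in-batch negatives.
views_a = torch.stack([gnn(*random_subgraph(x, a, 4)) for x, a in graphs])
views_b = torch.stack([gnn(*random_subgraph(x, a, 4)) for x, a in graphs])
logits = F.normalize(views_a, dim=-1) @ F.normalize(views_b, dim=-1).T
loss = F.cross_entropy(logits / 0.1, torch.arange(len(graphs)))
loss.backward()  # gradients update the graph encoder
```

Pre-training the two encoders this way lets the semantic and structural representations be learned from the same unlabeled corpus before any task-specific fine-tuning.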
Experimental Results
Through extensive experimentation on the ACE 2005 and MAVEN datasets, CLEVE demonstrates substantial improvements over existing methods in both supervised and unsupervised settings, particularly excelling in unsupervised “liberal event extraction”, where traditional methods struggle. CLEVE achieves this by combining semantic and structural signals, facilitating the extraction of complete event structures and types with minimal annotated guidance.
Numerical Insights
The experiments emphasize CLEVE's strength under data scarcity. In supervised settings, CLEVE outperforms fine-tuned RoBERTa baselines, achieving a 79.8% F1-score for event detection on ACE 2005, a significant advance. In unsupervised settings it reaches an event detection F1 of up to 53.7%, indicating that CLEVE generalizes to new event schemata more effectively than existing state-of-the-art methods and underscoring its applicability in annotation-scarce contexts.
Implications and Future Directions
The implications of CLEVE’s framework for the future of AI and event extraction are substantial. By integrating event-specific semantic structures into training paradigms, CLEVE introduces a path toward more robust and contextually aware information extraction systems. This approach also opens avenues for enhancing other NLP tasks requiring semantic and structural comprehension, demonstrating potential foundational shifts in how models are pre-trained for language understanding.
For future work, there is room to explore semantic structures beyond AMR, such as frame-semantic parses, and to refine the alignment between semantic and structural representations for further performance gains. Moreover, applying domain adaptation techniques within the CLEVE framework could tailor the approach to specific application areas, enhancing its utility across different information domains.