Learning from Context or Names? An Empirical Study on Neural Relation Extraction (2010.01923v2)

Published 5 Oct 2020 in cs.CL

Abstract: Neural models have achieved remarkable success on relation extraction (RE) benchmarks. However, there is no clear understanding which type of information affects existing RE models to make decisions and how to further improve the performance of these models. To this end, we empirically study the effect of two main information sources in text: textual context and entity mentions (names). We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks. Based on the analyses, we propose an entity-masked contrastive pre-training framework for RE to gain a deeper understanding on both textual context and type information while avoiding rote memorization of entities or use of superficial cues in mentions. We carry out extensive experiments to support our views, and show that our framework can improve the effectiveness and robustness of neural models in different RE scenarios. All the code and datasets are released at https://github.com/thunlp/RE-Context-or-Names.

Authors (8)
  1. Hao Peng (291 papers)
  2. Tianyu Gao (35 papers)
  3. Xu Han (270 papers)
  4. Yankai Lin (125 papers)
  5. Peng Li (390 papers)
  6. Zhiyuan Liu (433 papers)
  7. Maosong Sun (337 papers)
  8. Jie Zhou (687 papers)
Citations (191)

Summary

Analyzing the Empirical Study on Neural Relation Extraction

This paper investigates the relative contributions of textual context and entity names in neural relation extraction (RE), a core task for capturing semantic relationships in text. To support pre-training, the authors construct a dataset by aligning Wikipedia articles with Wikidata, yielding 744 relations and 867,278 sentences. The central question is whether the textual context surrounding entities or the entity names themselves provide the more reliable signal for relation extraction, and how each factor affects the performance of neural RE models.
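
The corpus construction follows a distant-supervision alignment: a Wikipedia sentence mentioning two linked entities is labeled with the Wikidata relation holding between them. The sketch below (not the authors' released code; the input structures are hypothetical) illustrates this alignment, including the filtering of entity pairs that have no Wikidata relation:

```python
# Illustrative sketch (not the authors' released code) of distant-supervision
# alignment between Wikipedia sentences and Wikidata triples. The inputs
# `sentences` and `wikidata_triples` are hypothetical; entity linking to
# Wikidata IDs is assumed to have happened upstream.
from collections import defaultdict

def build_pretraining_corpus(sentences, wikidata_triples):
    """sentences: iterable of dicts {"text", "head", "tail"} with Wikidata IDs;
    wikidata_triples: iterable of (head_id, relation_id, tail_id) triples."""
    relation_of = {(h, t): r for (h, r, t) in wikidata_triples}
    corpus = defaultdict(list)  # relation_id -> sentences expressing it
    for sent in sentences:
        rel = relation_of.get((sent["head"], sent["tail"]))
        if rel is not None:  # drop pairs with no Wikidata relation
            corpus[rel].append(sent)
    return corpus
```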

Methodological Framework

The research employs a pre-training approach built on the BERT-base architecture, comparing Matching the Blanks (MTB) with the proposed contrastive pre-training model (CP). Experiments are conducted on several datasets, including TACRED, SemEval, Wiki80, ChemProt, and FewRel, under settings ranging from fully supervised to few-shot learning. Hyperparameters such as learning rate, batch size, and sentence length were selected based on performance on the TACRED dataset.
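
As a rough illustration of the contrastive pre-training idea, the sketch below pairs sentences expressing the same relation as positives and treats the remaining in-batch sentences as negatives; the specific encoder, the use of the [CLS] vector, and the temperature value are assumptions for illustration rather than details taken from the paper or its repository:

```python
# Minimal sketch of an entity-masked contrastive objective in the spirit of CP.
# Assumptions (not taken from the paper's code): a Hugging Face BERT encoder,
# entity mentions already replaced by a placeholder token during preprocessing,
# the [CLS] vector as the relation representation, and in-batch negatives.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def encode(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    cls = encoder(**batch).last_hidden_state[:, 0]  # [CLS] representation
    return F.normalize(cls, dim=-1)

def contrastive_loss(sent_a, sent_b, temperature=0.05):
    """sent_a[i] and sent_b[i] express the same relation (a positive pair);
    every other in-batch combination is treated as a negative (InfoNCE)."""
    za, zb = encode(sent_a), encode(sent_b)
    logits = za @ zb.T / temperature
    targets = torch.arange(len(sent_a))
    return F.cross_entropy(logits, targets)
```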

The dataset also differs from prior pre-training corpora in that entity pairs without a Wikidata relation are filtered out, focusing learning on pairs with explicit relational content. This decision aims to improve training efficiency by eliminating uninformative samples.

Experimental Results and Observations

The experimental results show clear advantages for leveraging relational context over relying on entity names alone. Training with a contrastive objective yields more accurate relational predictions under several settings. Notably, the best-performing batch size for MTB in these experiments was 256, which deviates from previously reported settings and leads to better results on TACRED.

Implications and Future Research Directions

The paper's findings provide useful guidance for the design of neural relation extraction systems. By emphasizing the value of context, the work argues for deeper integration of contextual information into future RE methods. The reported gains in robustness and performance also point to ways of improving relation extraction in broader and more complex NLP applications, such as knowledge graph construction, that depend on structured semantic information.

Furthermore, this work lays the groundwork for further exploration of fine-tuning techniques and for extending these models to multilingual settings, given the multilingual nature of Wikidata. Future developments may also increase the density and variety of relational data, potentially incorporating more sophisticated unsupervised or semi-supervised learning paradigms.

Conclusion

This empirical paper highlights the interplay between contextual and name-based information in neural relation extraction, underscoring the strong influence of textual context on model performance. The pre-training methodology and dataset filtering strategy provide a solid baseline for future work on optimizing RE across languages and domains.
