ERNIE: Enhanced Language Representation with Informative Entities (1905.07129v3)

Published 17 May 2019 in cs.CL

Abstract: Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The source code of this paper can be obtained from https://github.com/thunlp/ERNIE.

Enhancing language representation models with external knowledge sources has been a continual area of investigation in NLP. The paper "ERNIE: Enhanced Language Representation with Informative Entities" explores this idea by introducing the ERNIE model, which leverages knowledge graphs (KGs) in conjunction with large-scale textual corpora to enhance language understanding, addressing limitations of existing pre-trained language models such as BERT.

Core Contributions

  1. Augmented Language Representation: ERNIE is pre-trained using not only large-scale textual corpora but also KGs, making it a hybrid model that simultaneously incorporates lexical, syntactic, and knowledge-based information. This dual-sourced training approach aims to import a richer context into the language representation, thereby enhancing overall understanding.
  2. Innovative Pre-training Task: The paper proposes a novel pre-training task, the denoising entity auto-encoder (dEA), which randomly masks some token-entity alignments and then predicts the corresponding entities from the aligned tokens. This mechanism injects external knowledge into the model during pre-training (a sketch of the objective follows this list).
  3. Model Architecture: ERNIE's architecture consists of a textual encoder (T-Encoder) for lexical and syntactic processing and a knowledgeable encoder (K-Encoder) that fuses token-based and entity-based information. This two-stage design lets the model attentively integrate knowledge from KGs into the language representations learned from text (the fusion step is sketched below).
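
To make these two pieces concrete, below is a minimal PyTorch sketch of an aggregator layer in the spirit of the K-Encoder's fusion step. It assumes token states from token self-attention, entity states from entity self-attention, and a per-token alignment index (-1 for tokens with no aligned entity); the names, dimensions, and alignment encoding are illustrative assumptions rather than the released thunlp/ERNIE code.

```python
import torch
import torch.nn as nn


class FusionLayer(nn.Module):
    """Sketch of a K-Encoder-style information-fusion step (illustrative only).

    Aligned token/entity pairs are mixed through a shared hidden state and
    projected back into separate token and entity spaces.
    """

    def __init__(self, token_dim: int = 768, entity_dim: int = 100, hidden_dim: int = 768):
        super().__init__()
        self.token_in = nn.Linear(token_dim, hidden_dim)
        self.entity_in = nn.Linear(entity_dim, hidden_dim)
        self.token_out = nn.Linear(hidden_dim, token_dim)
        self.entity_out = nn.Linear(hidden_dim, entity_dim)
        self.act = nn.GELU()

    def forward(self, tokens, entities, alignment):
        # tokens:    (batch, seq_len, token_dim)  output of token self-attention
        # entities:  (batch, n_ents, entity_dim)  output of entity self-attention
        # alignment: (batch, seq_len) index of the entity aligned to each token, -1 if none
        idx = alignment.clamp(min=0).unsqueeze(-1).expand(-1, -1, entities.size(-1))
        ent_for_token = torch.gather(entities, 1, idx)            # entity state per token
        has_entity = (alignment >= 0).unsqueeze(-1).to(tokens.dtype)

        # Shared hidden state; the entity term contributes only where an alignment exists.
        hidden = self.act(self.token_in(tokens)
                          + has_entity * self.entity_in(ent_for_token))

        new_tokens = self.act(self.token_out(hidden))
        # In the full model the entity outputs are scattered back to the entity slots;
        # here they are returned per token for brevity.
        new_entities = self.act(self.entity_out(hidden))
        return new_tokens, new_entities
```

The dEA objective can then be viewed as classifying each aligned token over a table of candidate entity embeddings; the corruption of alignments (masking or random replacement) is assumed to happen while batches are built. Again a hedged sketch with hypothetical names:

```python
import torch.nn.functional as F


def dea_loss(token_states, entity_table, alignment, proj):
    # token_states: (batch, seq_len, token_dim)  K-Encoder token outputs
    # entity_table: (n_candidates, entity_dim)   candidate entity embeddings
    # alignment:    (batch, seq_len)             gold entity index, or -1 if unaligned
    # proj:         nn.Linear(token_dim, entity_dim), maps tokens into entity space
    logits = proj(token_states) @ entity_table.t()       # (batch, seq_len, n_candidates)
    targets = alignment.clone()
    targets[targets < 0] = -100                           # ignore tokens without an entity
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1), ignore_index=-100)
```

During pre-training this loss is combined with the standard masked language modeling and next-sentence prediction objectives.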

Experimental Evaluation

Entity Typing

The performance of ERNIE was empirically validated on two established datasets: FIGER and Open Entity. On the distantly supervised FIGER dataset, ERNIE achieved significant gains, improving the strict accuracy metric by approximately 5 percentage points compared to BERT. This indicates that ERNIE’s integration of external knowledge helps mitigate the impact of noisy labels typical in distantly supervised datasets. On the manually annotated Open Entity dataset, ERNIE improved the micro-F1 score by 2 percentage points over BERT, highlighting ERNIE's efficacy in incorporating external knowledge for precise entity typing.

Relation Classification

For relation classification tasks, the model was tested on FewRel and TACRED datasets. ERNIE outperformed BERT on FewRel, achieving an approximate 3.4 percentage point lift in macro-F1 score. On the TACRED dataset, ERNIE showed an improvement over BERT by 2 percentage points in the F1 score, validating its superior performance in relation classification tasks aided by external knowledge.

GLUE Benchmark

To ensure that ERNIE's knowledge integration does not detract from its efficacy on standard NLP tasks, the model was assessed on the diverse tasks of the GLUE benchmark. ERNIE’s performance was found to be comparable to BERT across these tasks, affirming that the incorporation of KGs does not compromise its effectiveness on conventional NLP tasks.

Implications and Future Directions

The implications of this work are multi-dimensional:

  • Practical Impact: Enhanced performance on entity typing and relation classification tasks illustrates how ERNIE can improve real-world NLP applications that require nuanced understanding and precise information extraction, such as content recommendation systems, semantic search, and intelligent assistants.
  • Theoretical Advances: ERNIE’s innovative approach to integrating textual and knowledge-based information demonstrates new possibilities for hybrid models that effectively leverage multiple data sources for richer language representation.

Future Research Directions:

  • Expanding the incorporation of diverse forms of structured knowledge, beyond KGs like Wikidata, to include sources such as ConceptNet.
  • Adapting ERNIE's knowledge-injection approach to other pre-training strategies, including feature-based representation models such as ELMo.
  • Extending the pre-training corpus with heuristically annotated real-world corpora to build more robust and generalizable language representation models.

Conclusion

ERNIE exemplifies a careful approach to integrating substantive external knowledge into pre-trained language models. The model significantly outperforms BERT on knowledge-driven tasks while maintaining competitive performance on general NLP benchmarks. These results point to a promising trajectory for future NLP research on multi-source language understanding and representation.

Authors (6)
  1. Zhengyan Zhang
  2. Xu Han
  3. Zhiyuan Liu
  4. Xin Jiang
  5. Maosong Sun
  6. Qun Liu
Citations (1,317)