KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation (1911.06136v3)

Published 13 Nov 2019 in cs.CL

Abstract: Pre-trained language representation models (PLMs) cannot well capture factual knowledge from text. In contrast, knowledge embedding (KE) methods can effectively represent the relational facts in knowledge graphs (KGs) with informative entity embeddings, but conventional KE models cannot take full advantage of the abundant textual information. In this paper, we propose a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced KE with the strong PLMs. In KEPLER, we encode textual entity descriptions with a PLM as their embeddings, and then jointly optimize the KE and language modeling objectives. Experimental results show that KEPLER achieves state-of-the-art performances on various NLP tasks, and also works remarkably well as an inductive KE model on KG link prediction. Furthermore, for pre-training and evaluating KEPLER, we construct Wikidata5M, a large-scale KG dataset with aligned entity descriptions, and benchmark state-of-the-art KE methods on it. It shall serve as a new KE benchmark and facilitate the research on large KG, inductive KE, and KG with text. The source code can be obtained from https://github.com/THU-KEG/KEPLER.

KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation

The paper presents KEPLER, a unified model that integrates knowledge embedding (KE) and pre-trained language models (PLMs) to enhance both language representation and factual knowledge recall. This approach addresses the limitations inherent in PLMs, such as BERT and RoBERTa, which, while effective for linguistic tasks, do not capture factual knowledge well. Conversely, KE models efficiently represent relational facts from knowledge graphs (KGs) but cannot fully exploit the rich textual information associated with entities.

Methodology

KEPLER bridges the gap between PLMs and KE by extending the capabilities of PLMs to include factual knowledge. The model accomplishes this by encoding the textual descriptions of KG entities with the PLM itself to obtain entity embeddings, and optimizing these embeddings jointly with the traditional language modeling objective.

Specifically, KEPLER combines two key objectives in its framework:

  • Knowledge Embedding (KE) Objective: This component encodes entity descriptions from the KG into entity embeddings and trains them with a TransE-style scoring function over relational triples (a minimal sketch follows this list).
  • Masked Language Modeling (MLM) Objective: Retaining this objective from traditional PLMs ensures that KEPLER's language representations remain robust and contextually aware.
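
To make the TransE-style objective concrete, here is a minimal PyTorch sketch (not the authors' code): entity embeddings come from the leading <s> token of a RoBERTa encoder applied to entity descriptions, relations are looked-up vectors, and a margin-based loss with one corrupted tail stands in for negative sampling. The helper names, the margin value, and the example descriptions are illustrative assumptions.

```python
# Minimal sketch of a KEPLER-style KE objective, assuming PyTorch and
# Hugging Face Transformers. Helper names, the margin value, and the single
# corrupted tail are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

num_relations = 1000  # set to the number of relation types in the KG
relation_emb = torch.nn.Embedding(num_relations, 768)  # one learned vector per relation

def embed_entity(description: str) -> torch.Tensor:
    """Encode an entity description; the leading <s> token state serves as the entity embedding."""
    inputs = tokenizer(description, return_tensors="pt", truncation=True, max_length=128)
    hidden = encoder(**inputs).last_hidden_state   # (1, seq_len, 768)
    return hidden[:, 0]                            # (1, 768)

def transe_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """TransE-style energy ||h + r - t||: lower means the triple is more plausible."""
    return torch.norm(h + r - t, p=1, dim=-1)

# One positive triple and one corrupted tail, trained with a margin-based loss.
h = embed_entity("Johannes Kepler was a German astronomer and mathematician.")
t = embed_entity("Astronomer: a scientist who studies stars, planets, and other celestial objects.")
t_neg = embed_entity("Basketball is a team sport played on a rectangular court.")
r = relation_emb(torch.tensor([0]))  # e.g. an "occupation"-like relation

margin = 1.0
ke_loss = F.relu(margin + transe_score(h, r, t) - transe_score(h, r, t_neg)).mean()
# In KEPLER this loss is optimized jointly with the MLM loss on the same encoder.
```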

KEPLER encodes entities and text into a unified semantic space using a Transformer model, maintaining the model structure of RoBERTa to avoid additional inference complexity.
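
Schematically, the two losses are summed over the shared encoder; one simplified margin-based way to write this (the paper's exact negative-sampling formulation may differ) is:

```latex
% L_KE and L_MLM share the same Transformer encoder parameters.
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{KE}} + \mathcal{L}_{\mathrm{MLM}},
\qquad
\mathcal{L}_{\mathrm{KE}} \;=\; \sum_{(h,r,t)} \big[\gamma + d_r(\mathbf{h},\mathbf{t}) - d_r(\mathbf{h}',\mathbf{t}')\big]_{+},
\qquad
d_r(\mathbf{h},\mathbf{t}) \;=\; \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert,
```

where h and t are the PLM encodings of the head and tail entity descriptions, r is a learned relation vector, (h', t') is a corrupted (negative) triple, and γ is a margin.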

Experimental Evaluation

KEPLER was evaluated on various NLP tasks and knowledge integration scenarios, demonstrating its ability to incorporate factual knowledge without compromising language understanding capability.

  • NLP Tasks: KEPLER achieved state-of-the-art results on several challenging datasets, including TACRED for relation classification, FewRel for few-shot relation extraction, and OpenEntity for entity typing.
  • Knowledge Embedding Tasks: On KG link prediction, KEPLER showed strong performance, especially in the inductive setting, where entities unseen during training must be represented from their textual descriptions (see the sketch after this list).
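
As an illustration of how a description-based model can be evaluated inductively on link prediction, the sketch below ranks candidate tails for a (head, relation, ?) query with the same TransE-style score as before and checks a Hits@k style criterion. It reuses the hypothetical embed_entity, transe_score, and relation_emb helpers from the earlier sketch and is not the paper's evaluation code.

```python
# Hypothetical link-prediction ranking for unseen entities (reuses embed_entity,
# transe_score, and relation_emb from the earlier sketch; all names illustrative).
import torch

def rank_tails(head_desc, relation_id, candidate_descs, true_index, k=10):
    """Rank candidate tails for (head, relation, ?) and report whether the
    gold tail lands in the top-k (the basis of Hits@k / MRR metrics)."""
    with torch.no_grad():
        h = embed_entity(head_desc)
        r = relation_emb(torch.tensor([relation_id]))
        # Unseen entities are handled the same way as seen ones: their
        # embeddings are computed on the fly from their descriptions.
        scores = torch.cat([transe_score(h, r, embed_entity(d)) for d in candidate_descs])
    order = torch.argsort(scores)  # ascending: lower energy = more plausible
    rank = (order == true_index).nonzero(as_tuple=True)[0].item() + 1
    return rank, rank <= k
```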

The paper also introduces Wikidata5M, a large-scale knowledge graph dataset aligned with entity descriptions to serve as a comprehensive benchmark for testing such models.
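
For reference, a minimal sketch of reading a Wikidata5M-style release, assuming tab-separated triple files and an entity-description file; the file names and column layout are assumptions, not guaranteed to match the actual distribution.

```python
# Hypothetical loader for a Wikidata5M-style release. Assumed layout (not
# verified against the actual distribution): triples as
# "head_id<TAB>relation_id<TAB>tail_id" and descriptions as
# "entity_id<TAB>description", one record per line.

def load_triples(path):
    """Read (head, relation, tail) ID triples from a tab-separated file."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            head, rel, tail = line.rstrip("\n").split("\t")[:3]
            triples.append((head, rel, tail))
    return triples

def load_descriptions(path):
    """Map each entity ID to its textual description."""
    descriptions = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            entity_id, text = line.rstrip("\n").split("\t", 1)
            descriptions[entity_id] = text
    return descriptions

# Example usage (file names are placeholders):
# train_triples = load_triples("wikidata5m_train.txt")
# entity_text = load_descriptions("wikidata5m_text.txt")
```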

Results and Implications

KEPLER’s performance indicates that joint optimization of KE and MLM objectives can enhance a PLM’s ability to recall factual knowledge while maintaining linguistic robustness. The integration of KE with PLMs opens up new possibilities for building models that efficiently leverage both structured and unstructured data.

Future Directions

The suggested future work includes:

  • Exploring more sophisticated KE methods to enhance KEPLER's knowledge representation capabilities without increasing complexity.
  • Developing better knowledge probing methodologies to accurately assess the model's knowledge retention and retrieval capabilities across diverse factual datasets.

In summary, KEPLER represents a significant step forward in the integration of linguistic and factual knowledge, providing a robust framework for applications requiring nuanced understanding from textual and structured data sources.

Authors (7)
  1. Xiaozhi Wang
  2. Tianyu Gao
  3. Zhaocheng Zhu
  4. Zhengyan Zhang
  5. Zhiyuan Liu
  6. Juanzi Li
  7. Jian Tang
Citations (607)