- The paper introduces a knowledge-augmented BERT that integrates structured knowledge graphs via a soft-position embedding and visible matrix to preserve sentence semantics.
- It demonstrates significant performance gains over standard BERT on domain-specific tasks, including clinical named entity recognition and financial and legal text analysis.
- The approach enables rapid domain adaptation by allowing seamless switching of underlying knowledge graphs, paving the way for advanced AI applications.
K-BERT: Enabling Language Representation with Knowledge Graph
Introduction
The integration of knowledge graphs (KGs) into pre-trained language representation (LR) models has been an ongoing area of research, particularly to address the limitations of domain-specific knowledge capture. While models like BERT achieve remarkable success on open-domain tasks through extensive corpus pre-training, they fall short in applications that demand domain expertise, such as financial analysis, legal document processing, and medical diagnostics. K-BERT addresses this gap by injecting knowledge graph triples directly into the input, enriching textual representation with structured, domain-specific knowledge.
Model Architecture and Methodology
K-BERT enhances a traditional BERT framework by integrating a knowledge layer that injects relevant triples from KGs into the input sentences, forming a knowledge-rich sentence tree. This tree structure retains both the original text’s semantics and the additional knowledge context. The methodology hinges on two critical constructs: the soft-position embedding and the visible matrix.
- Soft-Position Embedding: Injected KG tokens receive position indices that continue from their anchor token rather than shifting the rest of the sentence, so the original word order, and hence the sentence's structure, is preserved when supplemental knowledge is interspersed.
- Visible Matrix: To mitigate "knowledge noise" (KN), the degradation caused by excessive or irrelevant injected knowledge, the visible matrix limits each token's attention scope so that injected triples influence only the entity they attach to rather than the whole sentence.
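The two constructs above can be sketched together as a toy knowledge layer: branch tokens are injected after their anchor, soft positions continue from the anchor's index, and the visible matrix restricts each branch to its own tokens plus the anchor. This is an illustrative reconstruction; the function name and the `kg` dict format are assumptions, not the paper's code.

```python
import numpy as np

def build_sentence_tree(tokens, kg):
    """Inject KG branch tokens and build soft positions plus a visible matrix.

    tokens : trunk tokens of the original sentence.
    kg     : dict mapping a trunk token to a list of branch tokens to inject
             after it (illustrative format, not the paper's data structure).
    """
    flat, soft, anchor = [], [], []   # anchor[k] = hard index of the trunk
    for i, tok in enumerate(tokens):  # token a branch hangs on, -1 for trunk
        trunk_hard = len(flat)
        flat.append(tok); soft.append(i); anchor.append(-1)
        for j, branch_tok in enumerate(kg.get(tok, [])):
            flat.append(branch_tok)
            soft.append(i + 1 + j)    # soft position continues from the anchor
            anchor.append(trunk_hard)
    n = len(flat)
    vis = np.zeros((n, n), dtype=int)
    for a in range(n):
        for b in range(n):
            # visible iff: both trunk, same branch, or one is the other's anchor
            if anchor[a] == anchor[b] or anchor[a] == b or anchor[b] == a:
                vis[a, b] = 1
    return flat, soft, vis
```

For the paper's running example, `build_sentence_tree(["Tim", "Cook", "is", "visiting", "Beijing"], {"Cook": ["CEO", "Apple"], "Beijing": ["capital", "China"]})` gives the branch token "CEO" the soft position of the word that would follow "Cook", while keeping it invisible to "is" and "visiting".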
K-BERT’s architecture, therefore, enables seamless alignment with pre-trained BERT models, allowing direct parameter adoption and facilitating practical deployment without the overhead of exhaustive pre-training on domain-specific corpora.
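This compatibility holds because the visible matrix only changes the attention mask, not any learned weights. A minimal single-head numpy sketch of mask-self-attention, where invisible positions receive a large negative score before the softmax, under assumed shapes and with no learned projections:

```python
import numpy as np

def mask_self_attention(Q, K, V, visible):
    """Mask-self-attention sketch: positions invisible to a query token get
    a large negative score before the softmax, so they contribute (almost)
    nothing to its output representation."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(visible == 1, scores, -1e9)  # mask invisible pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V
```

Because the masking happens on the score matrix, a standard pre-trained BERT checkpoint can be loaded unchanged and the visible matrix supplied at input time.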
Experimental Evaluation
The efficacy of K-BERT was evaluated across twelve diverse NLP tasks, including open-domain benchmarks and targeted domain-specific applications. The experimental setup quantitatively demonstrated K-BERT’s enhanced performance over baseline BERT models:
- Domain-Specific Tasks: Significant performance improvements were noted across domain-centric tasks in finance, law, and medicine, underscoring K-BERT’s ability to harness domain knowledge effectively. For instance, utilizing a medical KG notably enhanced the F1 score in clinical named entity recognition tasks.
- Open-Domain Tasks: While the improvements were less pronounced, they confirmed the utility of K-BERT in leveraging additional semantic information to refine results in more generalized NLP tasks.
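The F1 score cited for clinical NER is conventionally computed over exact entity-span matches. A minimal illustrative sketch of that metric (not the paper's evaluation script; the `(label, start, end)` span format is an assumption):

```python
def span_f1(predicted, gold):
    """Entity-level F1 over exact (label, start, end) span matches.
    Illustrative sketch of the standard NER metric, not K-BERT code."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)                         # exact span matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```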
K-BERT’s integration with both language-oriented KGs (e.g., HowNet) and encyclopedic sources (e.g., CN-DBpedia) averts cumbersome pre-training, allowing focused fine-tuning on available datasets for immediate applicability.
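The retrieval step that precedes injection (K-Query in the paper) can be approximated as a head-match lookup over a triple set. The tuple-set `kg_triples` format below is a toy stand-in for CN-DBpedia or HowNet, not their actual interfaces:

```python
def k_query(tokens, kg_triples):
    """Toy stand-in for K-Query: for each sentence token, return every
    (relation, tail) pair whose head matches it.

    kg_triples: a set of (head, relation, tail) tuples -- an illustrative
    format, not the schema of CN-DBpedia or HowNet.
    """
    return {tok: sorted((r, t) for (h, r, t) in kg_triples if h == tok)
            for tok in tokens}
```

Swapping the triple set is all it takes to retarget the lookup, which mirrors how K-BERT switches domains by switching KGs.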
Implications and Future Directions
The introduction of K-BERT marks a significant step towards operationalizing knowledge graph integration within language representation models, presenting clear implications for the field of AI:
- Practical Implications: K-BERT’s architecture permits rapid adaptation to new domains by simply switching or updating the underlying KG. This aligns with use cases where domain knowledge rapidly evolves, such as legal and regulatory contexts.
- Theoretical Implications: The results demonstrate the potential of structured knowledge integration to overcome inherent limitations of corpus-only training, emphasizing the need for further exploration into optimal knowledge representation and integration strategies.
Future developments could include refining the K-Query process to weight retrieved triples by their relevance to the sentence, and extending the K-BERT methodology to other LR models such as ELMo or XLNet, further pushing the boundaries of domain-specific AI applications.
Conclusion
K-BERT presents a promising enhancement to the BERT model by incorporating knowledge graphs to improve performance on domain-specific tasks. This approach significantly contributes to bridging the gap between existing open-domain language representations and the necessity for context-rich domain understanding. The versatility and efficiency of K-BERT point to a future where AI can more adeptly navigate specialized areas requiring deep knowledge integrations.