From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer (2202.02113v7)

Published 4 Feb 2022 in cs.CL, cs.AI, cs.DB, cs.IR, and cs.LG

Abstract: Knowledge graph completion aims to address the problem of extending a KG with missing triples. In this paper, we provide an approach, GenKGC, which converts knowledge graph completion into a sequence-to-sequence generation task with a pre-trained language model. We further introduce relation-guided demonstration and entity-aware hierarchical decoding for better representation learning and fast inference. Experimental results on three datasets show that our approach can obtain performance better than or comparable to baselines, and achieves faster inference compared with previous methods based on pre-trained language models. We also release a new large-scale Chinese knowledge graph dataset, AliopenKG500, for research purposes. Code and datasets are available at https://github.com/zjunlp/PromptKG/tree/main/GenKGC.

Knowledge Graph Completion with GenKGC: A Generative Approach

The paper "From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer" presents GenKGC, a novel methodology for knowledge graph completion (KGC) that leverages a generative approach via sequence-to-sequence (seq2seq) models. This approach aims to surpass traditional discriminative techniques which rely heavily on pre-defined scoring functions and expensive negative sampling.

Methodological Overview

The authors propose the GenKGC framework, which models the KGC task as a seq2seq generation problem built on the pre-trained language model BART. Entities and relations are represented as text sequences, allowing the model to complete missing triples by generating the target entity as an output sequence. This generative approach contrasts with prior discriminative models such as TransE, ComplEx, and RotatE, which embed entities and relations in vector spaces and score candidate triples via geometric operations.
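
To make the contrast concrete, here is a minimal sketch in Python. The TransE score follows its standard definition; the text template on the generative side is an illustrative assumption, not the paper's exact input format:

```python
import numpy as np

# Discriminative view (TransE-style): embed h, r, t and score every
# candidate tail entity; training requires sampling negative triples.
def transe_score(e_h: np.ndarray, e_r: np.ndarray, e_t: np.ndarray) -> float:
    # f(h, r, t) = -||e_h + e_r - e_t||
    return -float(np.linalg.norm(e_h + e_r - e_t))

# Generative view (GenKGC-style): serialize the incomplete triple as text
# and let a seq2seq model (e.g., BART) generate the missing tail entity.
def serialize_query(head: str, relation: str) -> str:
    # Hypothetical template; the paper's exact format may differ.
    return f"{head} [SEP] {relation} [SEP] [MASK]"

src = serialize_query("Steve Jobs", "founded")
# A fine-tuned BART model would be trained to emit the tail entity's
# name, e.g. "Apple Inc.", given `src` as the encoder input.
```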

Innovative Framework Components

  1. Relation-Guided Demonstration: Drawing inspiration from prompt-based learning, the paper introduces relation-guided demonstrations. By incorporating training triples that share the query's relation into the input sequence, the model improves few-shot performance and relational learning (a minimal sketch follows this list).
  2. Entity-Aware Hierarchical Decoding: To address the inefficiency of scoring every entity candidate, the authors implement beam search under entity-aware hierarchical constraints. A prefix tree over entity names, combined with type-driven constraints, restricts the decoding space and yields significant reductions in inference time (see the constrained-decoding sketch after the list).
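
The demonstration idea can be sketched in a few lines of Python. This is a minimal illustration under assumed names and a hypothetical text template, not the authors' code: sample a few training triples that share the query's relation and prepend them to the input.

```python
import random

def build_demonstrated_input(head, relation, train_triples, k=2, seed=0):
    """Prepend k demonstration triples sharing `relation` to the query.

    train_triples: list of (head, relation, tail) string tuples.
    """
    rng = random.Random(seed)
    same_rel = [t for t in train_triples
                if t[1] == relation and t[0] != head]
    demos = rng.sample(same_rel, min(k, len(same_rel)))
    demo_text = " ".join(f"{h} {r} {t}." for h, r, t in demos)
    # The query keeps the "generate the missing tail" framing.
    return f"{demo_text} {head} {relation} [MASK]".strip()

kg = [("Steve Jobs", "founded", "Apple Inc."),
      ("Bill Gates", "founded", "Microsoft"),
      ("Larry Page", "founded", "Google")]
# e.g. "Bill Gates founded Microsoft. Larry Page founded Google.
#       Elon Musk founded [MASK]"
print(build_demonstrated_input("Elon Musk", "founded", kg))
```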
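Entity-aware decoding can likewise be approximated with off-the-shelf tooling. The sketch below uses Hugging Face's `prefix_allowed_tokens_fn` argument to `generate()` (a real API) with a hand-built trie over entity names; it illustrates only the prefix-tree constraint, not the paper's additional type-driven constraints, and the checkpoint and input template are assumptions:

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Build a prefix tree (trie) over the token ids of every valid entity
# name, so beam search can only follow paths that spell a real entity.
# (Leading-space handling of BPE tokens is glossed over here.)
entities = ["Apple Inc.", "Microsoft", "Google"]
trie: dict = {}
for name in entities:
    node = trie
    for tok in tokenizer(name, add_special_tokens=False).input_ids:
        node = node.setdefault(tok, {})

special_ids = set(tokenizer.all_special_ids)

def allowed_tokens(batch_id: int, input_ids: torch.Tensor) -> list:
    """Return the token ids reachable from the decoded prefix so far."""
    node = trie
    for tok in input_ids.tolist():
        if tok in special_ids:        # skip decoder-start / BOS tokens
            continue
        node = node.get(tok)
        if node is None:              # prefix left the trie: stop the beam
            return [tokenizer.eos_token_id]
    # At a leaf the entity name is complete, so only EOS may follow.
    return list(node.keys()) or [tokenizer.eos_token_id]

inputs = tokenizer("Steve Jobs founded [MASK]", return_tensors="pt")
out = model.generate(**inputs, num_beams=3, max_length=16,
                     prefix_allowed_tokens_fn=allowed_tokens)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because each decoding step only considers the children of the current trie node, the per-step candidate set shrinks from the full vocabulary to a handful of tokens, which is where the reported inference speedups come from.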

Experimental Results

The framework was empirically validated on multiple datasets, including WN18RR, FB15k-237, and a newly introduced large-scale dataset, OpenBG500.

  • Performance Metrics: GenKGC achieved performance comparable to existing models while substantially reducing inference time. In particular, on OpenBG500, which contains 269,658 entities, inference was dramatically faster than prior pre-trained language model methods such as KG-BERT.
  • Efficiency: The approach lowers the memory and compute demands typically associated with large-scale knowledge graphs, making it viable for real-world applications.

Implications and Future Directions

The GenKGC framework signifies a promising shift towards generative models within the KGC domain, highlighting:

  • Practical Implications: Enhanced efficiency and scalability of KGC processes, crucial for industrial applications where large datasets and rapid inference are paramount.
  • Theoretical Implications: Encouragement for further exploration of generative paradigms in knowledge representation tasks, potentially uncovering new insights and efficiencies by exploiting the seq2seq capabilities.

The paper anticipates future work on finer-grained modeling of entity relationships and on further refinements to the hierarchical decoding process. The integration of generative models into knowledge-intensive domains remains an attractive frontier, poised to yield further advances.

Authors (8)
  1. Xin Xie (81 papers)
  2. Ningyu Zhang (148 papers)
  3. Zhoubo Li (6 papers)
  4. Shumin Deng (65 papers)
  5. Hui Chen (298 papers)
  6. Feiyu Xiong (53 papers)
  7. Mosha Chen (17 papers)
  8. Huajun Chen (198 papers)
Citations (71)