- The paper introduces TorusE, a novel knowledge graph embedding model that maps entities and relations onto a torus to inherently avoid the regularization problems encountered by models like TransE.
- TorusE demonstrates superior performance over several state-of-the-art models on benchmark datasets WN18 and FB15K, showing significant improvements in metrics such as HITS@1 without the need for explicit normalization.
- By leveraging the properties of a compact Lie group, TorusE offers a robust and scalable alternative to traditional embedding methods and suggests exploring other non-Euclidean spaces for knowledge graph representation.
Knowledge Graph Embedding on a Lie Group: A Study of TorusE
The paper "TorusE: Knowledge Graph Embedding on a Lie Group" introduces a novel model named TorusE for knowledge graph embedding, leveraging the properties of a Lie group to address the regularization issues inherent in previous methods such as TransE. Authored by Takuma Ebisu and Ryutaro Ichise, the work explores the limitations of the TransE model and proposes an innovative embedding space that promises enhanced performance and scalability.
Background and Motivation
Knowledge graphs, which represent facts as triples (head, relation, tail), are critical for AI tasks such as question answering and fact checking. Because these graphs are typically incomplete, link-prediction models such as TransE are used to infer missing triples. TransE follows a translation principle: for a true triple (h, r, t), the embeddings should satisfy h + r ≈ t, so a relation acts as a simple vector translation between its head and tail entities. Despite its simplicity and efficiency, TransE has been criticized for its regularization approach, which forces entity embeddings onto a sphere by normalizing them to unit length; this constraint conflicts with the translation principle and warps the embeddings, degrading prediction accuracy.
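To make the translation principle concrete, here is a minimal sketch of TransE's scoring idea in Python; the function name, the L2-norm choice, and the toy vectors are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE's translation principle: a true triple (h, r, t) should
    satisfy h + r ≈ t, so a smaller distance means a more plausible triple."""
    return np.linalg.norm(h + r - t)

# Toy 3-dimensional embeddings (illustrative values only).
h = np.array([0.2, 0.1, -0.3])
r = np.array([0.5, -0.1, 0.4])
t = np.array([0.7, 0.0, 0.1])
print(transe_score(h, r, t))  # near 0, so this triple looks plausible
```

In practice TransE must renormalize entity vectors after each update to keep scores bounded, and it is exactly this step that TorusE removes.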
TorusE: A Novel Approach
TorusE addresses the regularization flaw of TransE by embedding entities and relations on an n-dimensional torus, a compact Lie group obtained as the quotient R^n/Z^n. The choice of a torus as the embedding space eliminates the need for regularization: because the space is compact, embeddings cannot diverge no matter how training proceeds. This allows TorusE to keep TransE's concise translation principle without its drawbacks, preserving the integrity of translated triples.
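A convenient computational picture of the torus is to keep each coordinate as its fractional part in [0, 1), so that addition simply wraps around. The helpers below are a minimal sketch of this representation; the names to_torus and torus_add are hypothetical, not from the paper.

```python
import numpy as np

def to_torus(x):
    """Project a point in R^n onto the torus T^n = R^n / Z^n by taking
    fractional parts, so every point is represented in [0, 1)^n."""
    return x - np.floor(x)

def torus_add(x, y):
    # Group operation on the torus: ordinary addition followed by
    # wraparound, so embeddings can never drift off to infinity.
    return to_torus(x + y)

x = np.array([0.9, 0.2])
r = np.array([0.3, -0.5])
print(torus_add(x, r))  # ≈ [0.2, 0.7], wrapped back into [0, 1)^2
```

Because every point stays inside this bounded region by construction, no explicit normalization step is ever required.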
Methodology
The methodology represents entities and relations as points on the torus, where addition and subtraction are well-defined, smooth group operations. TorusE's scoring functions assess the plausibility of a triple through distance measures that are naturally bounded on the torus; the paper derives several variants, based on L1, L2, and an embedded-L2 distance. These distances can be computed efficiently, giving TorusE a computational complexity similar to TransE's but without the overhead of normalization.
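As an illustration of a bounded torus distance, the sketch below computes an L1-style distance by taking, per coordinate, the shorter of the two arcs around the unit circle, and uses it to score a triple in TransE's h + r ≈ t style. This is a minimal sketch of one variant only; the paper's exact scoring functions (including the L2 and eL2 forms and their scaling constants) differ in detail.

```python
import numpy as np

def torus_l1_distance(x, y):
    """L1 distance on the torus: per coordinate, take the shorter of the
    two ways around the unit circle, min(d, 1 - d), then sum."""
    x, y = x - np.floor(x), y - np.floor(y)  # wrap both points into [0, 1)^n
    d = np.abs(x - y)
    return np.sum(np.minimum(d, 1.0 - d))

def toruse_score(h, r, t):
    # Same translation principle as TransE (h + r ≈ t), but both the
    # translation and the distance live on the torus.
    return torus_l1_distance(h + r, t)

h = np.array([0.9, 0.8])
r = np.array([0.2, 0.3])
t = np.array([0.1, 0.1])
print(toruse_score(h, r, t))  # ≈ 0: h + r wraps around exactly onto t
```

Note how the score stays small even though h + r leaves [0, 1)^2 in ordinary Euclidean coordinates; this wraparound is precisely what lets TorusE drop TransE's normalization step.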
Experimental Results
The paper’s experiments show that TorusE outperforms several state-of-the-art models, including TransE, DistMult, and ComplEx, on the benchmark datasets WN18 and FB15K. The gains are most pronounced on HITS@1, where TorusE markedly improves accuracy, illustrating its ability to capture complex relational structure without any regularization. The experiments also underscore TorusE’s scalability: it trains faster than TransE at high embedding dimensions, since it skips the per-step normalization that TransE requires.
Implications and Future Work
The implications of TorusE are substantial, as it provides a robust alternative to popular embedding strategies, circumventing issues related to regularization without compromising on efficiency or simplicity. This advancement signals potential shifts in embedding practices, encouraging the exploration of alternative mathematical structures beyond traditional vector spaces.
Future research directions include exploring other compact Lie groups for embedding spaces and integrating TorusE with more complex models that utilize additional information beyond triples. Such explorations could further enhance the generalization capabilities of knowledge graph completion models, supporting a broader range of AI applications.
In conclusion, the introduction of TorusE demonstrates the value of applying advanced mathematical concepts, such as Lie groups, to tackle longstanding challenges in AI model design. This approach not only resolves specific issues of regularization but also opens new avenues for research in knowledge graph embeddings and their applications.