SimplE Embedding for Link Prediction in Knowledge Graphs (1802.04868v2)

Published 13 Feb 2018 in stat.ML and cs.LG

Abstract: Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE's code is available on GitHub at https://github.com/Mehran-k/SimplE.

Citations (656)

Summary

  • The paper introduces a novel tensor factorization approach, SimplE, that interdependently learns head and tail entity embeddings for improved link prediction.
  • The model is fully expressive and shows competitive performance with high MRR and hit@k metrics on standard benchmarks like WN18 and FB15k.
  • SimplE’s efficient bilinear framework supports scalable computations and naturally incorporates background knowledge for enhanced reasoning.

SimplE Embedding for Link Prediction in Knowledge Graphs

The paper "SimplE Embedding for Link Prediction in Knowledge Graphs" by Seyed Mehran Kazemi and David Poole introduces a novel tensor factorization approach, called SimplE, for link prediction in knowledge graphs (KGs). The approach addresses key limitations of Canonical Polyadic (CP) decomposition and achieves strong empirical performance relative to state-of-the-art models.

Key Contributions and Methodology

The paper primarily focuses on improving CP decomposition, which performs poorly for link prediction because it learns the head and tail embeddings of each entity independently. SimplE couples these two embeddings so they are learned dependently, enhancing both the expressiveness and the interpretability of KG embeddings.

  1. Model Definition: As in CP, each entity has two embeddings (one used when the entity appears as head, one as tail). SimplE additionally assigns each relation two vectors, one for the relation and one for its inverse, and scores a triple by averaging the two resulting CP-style products. This coupling lets the two entity embeddings be learned dependently.
  2. Expressiveness: The authors prove SimplE is fully expressive: with embeddings of sufficient size, it can represent any assignment of truth values to the triples of a KG. They also derive a bound on the embedding size required for full expressivity in terms of the numbers of entities and relations.
  3. Bilinear Framework: Positioning SimplE within the family of bilinear models, the paper draws parallels with approaches such as DistMult and ComplEx. Unlike those models, SimplE's constraints avoid the computational redundancy present in similar frameworks such as ComplEx.
  4. Incorporating Background Knowledge: SimplE's architecture naturally allows for the encoding of background knowledge through parameter tying, which is a major advantage over other methods that require post-processing or additional penalty terms in the optimization.
  5. Time Complexity: The model maintains linear complexity concerning the embedding size, aligning with the requirements for scalability to large-scale KGs.
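The scoring function described in item 1 can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the authors' released code; all function and variable names (`simple_score`, `h_ei`, `v_r_inv`, etc.) are my own labels for the quantities in the paper: each entity e has a head embedding and a tail embedding, each relation r has a vector for r and one for its inverse, and the two CP-style products are averaged.

```python
import numpy as np

def simple_score(h_ei, t_ei, h_ej, t_ej, v_r, v_r_inv):
    """Score the triple (e_i, r, e_j) as the average of two CP-style
    trilinear products, coupling each entity's head and tail embeddings.

    h_ei, t_ei : head/tail embeddings of entity e_i
    h_ej, t_ej : head/tail embeddings of entity e_j
    v_r, v_r_inv : embeddings of relation r and of its inverse
    """
    forward = np.sum(h_ei * v_r * t_ej)       # CP score for (e_i, r, e_j)
    backward = np.sum(h_ej * v_r_inv * t_ei)  # CP score for (e_j, r^{-1}, e_i)
    return 0.5 * (forward + backward)
```

This sketch also makes the weight-tying point of item 4 concrete: tying `v_r_inv = v_r` makes the score symmetric in e_i and e_j, so a relation known to be symmetric can be encoded directly through parameter tying rather than post-processing.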

Experimental Evaluations

The experimental comparison demonstrates that SimplE achieves competitive or better performance on the standard benchmarks WN18 and FB15k. SimplE is particularly strong on filtered mean reciprocal rank (MRR) and hit@k metrics, notably filtered MRR and hit@1 on FB15k.
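The filtered ranking protocol behind these metrics can be sketched as follows. This is a simplified illustration of the standard evaluation procedure, not the authors' evaluation code, and the names are my own: for each test triple, every other candidate entity already known to form a true triple is excluded before computing the rank of the test entity.

```python
import numpy as np

def filtered_metrics(scores, true_idx, other_true_idx, k=1):
    """Filtered rank of the test entity among candidate entities.

    scores         : model scores over all candidate entities
    true_idx       : index of the correct (test) entity
    other_true_idx : indices of other candidates known to be true,
                     which are filtered out before ranking
    Returns (reciprocal rank, 1 if rank <= k else 0).
    """
    s = scores.astype(float).copy()
    s[list(other_true_idx)] = -np.inf          # filter other true triples
    rank = 1 + int(np.sum(s > s[true_idx]))    # strictly-better candidates
    return 1.0 / rank, int(rank <= k)
```

Averaging the reciprocal ranks over all test triples gives the filtered MRR, and averaging the indicator gives hit@k.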

Implications and Future Directions

The paper opens discussions on incorporating richer forms of background knowledge directly into model design without compromising the simplicity and efficiency of embeddings. This could also lead to new strategies in ensemble learning or hybrid models combining logic-based approaches with statistical methods, potentially facilitating more comprehensive reasoning over KGs.

Future Developments:

  • Exploring analogical structures and ensemble methods to bolster prediction accuracy.
  • Integration with logic-based frameworks for enhanced property reasoning.
  • Adapting SimplE in multi-task and transfer learning scenarios to leverage knowledge across related domains.

In summary, SimplE represents a significant advance in link prediction, balancing expressiveness, computational efficiency, and practical applicability, and it holds promising prospects for future developments in knowledge representation and reasoning systems.