
Knowledge Graph Completion via Complex Tensor Factorization (1702.06879v2)

Published 22 Feb 2017 in cs.AI, cs.LG, math.SP, and stat.ML

Abstract: In statistical relational learning, knowledge graph completion deals with automatically understanding the structure of large knowledge graphs---labeled directed graphs---and predicting missing relationships---labeled edges. State-of-the-art embedding models propose different trade-offs between modeling expressiveness and time and space complexity. We reconcile both expressiveness and complexity through the use of complex-valued embeddings and explore the link between such complex-valued embeddings and unitary diagonalization. We corroborate our approach theoretically and show that all real square matrices---thus all possible relation/adjacency matrices---are the real part of some unitarily diagonalizable matrix. This result opens the door to many other applications of square matrix factorization. Our approach based on complex embeddings is arguably simple, as it only involves a Hermitian dot product, the complex counterpart of the standard dot product between real vectors, whereas other methods resort to increasingly complicated composition functions to increase their expressiveness. The proposed complex embeddings are scalable to large data sets, as the model remains linear in both space and time, while consistently outperforming alternative approaches on standard link prediction benchmarks.

Citations (277)

Summary

  • The paper's main contribution is a complex tensor factorization model that uses complex embeddings to capture both symmetric and asymmetric relations efficiently.
  • The methodology leverages unitary diagonalization and a Hermitian dot product to ensure linear scalability in time and space while enhancing model expressiveness.
  • Empirical evaluations on datasets like FB15K and WN18 demonstrate improved accuracy over methods such as TransE, DistMult, and RESCAL.

Knowledge Graph Completion via Complex Tensor Factorization

The paper "Knowledge Graph Completion via Complex Tensor Factorization" presents a comprehensive approach to addressing the challenge of knowledge graph completion. This task involves predicting missing relationships within large-scale knowledge graphs, which are instrumental in numerous applications spanning from recommendation systems to question answering systems.

Theoretical Foundations and Methodology

The research underscores the growing importance of relational learning and offers a novel solution through complex embeddings to enhance both expressiveness and computational efficiency. Specifically, the authors propose using complex-valued embeddings and demonstrate how these embeddings can be naturally linked to unitary diagonalization, facilitating the representation of both symmetric and asymmetric relationships. The paper highlights that real square matrices can be represented as the real part of unitarily diagonalizable matrices, opening avenues for broader applications of matrix factorizations in relational learning.
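
To make the diagonalization result concrete, it can be stated as follows (a paraphrase with notation of our choosing, not necessarily the paper's exact symbols):

```latex
% Claim: every real square matrix X -- e.g. the adjacency matrix of one
% relation -- is the real part of a unitarily diagonalizable matrix.
% Constructive sketch: Z = X + iX^T is normal (Z Z^* = Z^* Z) with
% Re(Z) = X, and any normal matrix factors as Z = E diag(w) E^*.
X = \operatorname{Re}\!\left( E \,\operatorname{diag}(w)\, E^{*} \right),
\qquad E E^{*} = I_n, \quad w \in \mathbb{C}^{n},
\qquad
X_{so} = \operatorname{Re}\Big( \sum_{k=1}^{n} E_{sk}\, w_k\, \overline{E_{ok}} \Big).
```

In the learned model, only K ≪ n dimensions of this decomposition are kept, which is what yields low-rank complex embeddings for entities (rows of E) and relations (the vector w).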

The central proposition involves representing relations in knowledge graphs via complex tensor factorization. Traditional methods often struggle with balancing complexity, expressiveness, and scalability. The proposed approach leverages complex embeddings with the Hermitian dot product, achieving linear scalability in time and space while effectively modeling diverse relational properties. The use of complex embeddings permits capturing intricate patterns, including asymmetry, without resorting to overly complex composition functions.
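
As an illustration, here is a minimal NumPy sketch of such a Hermitian-product scoring function (variable names and the toy setup are ours; in the paper the score additionally feeds a logistic link during training):

```python
import numpy as np

def complex_score(e_s: np.ndarray, w_r: np.ndarray, e_o: np.ndarray) -> float:
    """Score a (subject, relation, object) triple as Re(<w_r, e_s, conj(e_o)>).

    All three arguments are complex vectors of the same dimension k,
    so the score costs O(k) time and O(k) memory per triple.
    """
    return float(np.real(np.sum(w_r * e_s * np.conj(e_o))))

# Toy usage with random embeddings of dimension k = 4.
rng = np.random.default_rng(0)
k = 4
e_s = rng.normal(size=k) + 1j * rng.normal(size=k)
e_o = rng.normal(size=k) + 1j * rng.normal(size=k)
w_r = rng.normal(size=k) + 1j * rng.normal(size=k)

# The Hermitian product is not symmetric in its entity arguments,
# which is what lets the model represent asymmetric relations.
print(complex_score(e_s, w_r, e_o), complex_score(e_o, w_r, e_s))
```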

Empirical Evaluation and Results

The empirical evaluations on both synthetic and real-world data sets reveal strong performance, with the model outperforming several state-of-the-art methods such as TransE, DistMult, and RESCAL. A key contribution is demonstrating that Hermitian products in complex vector spaces let the model distinguish between symmetric and antisymmetric relations, a known difficulty for many existing models. The results on standard benchmarks such as FB15K and WN18 support the theoretical claims, showing improved accuracy while retaining linear scalability.
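
This symmetry/antisymmetry behavior can be verified directly: a purely real relation vector makes the Hermitian-product score symmetric in subject and object, while a purely imaginary one makes it antisymmetric. Below is a small self-contained check (a sketch under the same notation as above, not the authors' code):

```python
import numpy as np

def complex_score(e_s, w_r, e_o):
    # Re(<w_r, e_s, conj(e_o)>), as in the sketch above.
    return float(np.real(np.sum(w_r * e_s * np.conj(e_o))))

rng = np.random.default_rng(1)
k = 4
e_s = rng.normal(size=k) + 1j * rng.normal(size=k)
e_o = rng.normal(size=k) + 1j * rng.normal(size=k)

w_sym = rng.normal(size=k) + 0j    # Im(w_r) = 0 -> symmetric relation
w_anti = 1j * rng.normal(size=k)   # Re(w_r) = 0 -> antisymmetric relation

assert np.isclose(complex_score(e_s, w_sym, e_o),
                  complex_score(e_o, w_sym, e_s))    # f(s, o) ==  f(o, s)
assert np.isclose(complex_score(e_s, w_anti, e_o),
                  -complex_score(e_o, w_anti, e_s))  # f(s, o) == -f(o, s)
print("symmetric and antisymmetric cases behave as claimed")
```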

Implications and Future Directions

The complex tensor factorization model advances the field of knowledge graph completion by bringing complex linear algebra into machine learning. This intersection opens new pathways for representation learning frameworks capable of addressing the intricate nature of world knowledge. The implications extend beyond relational learning, suggesting applicability wherever complex-valued representations offer a mathematical advantage, potentially including signal processing and other domains that naturally involve complex-valued data.

The authors suggest that future work could explore deeper integrations of complex linear algebra within machine learning frameworks, enhance models with better negative sampling strategies, and extend the methodology to encompass multi-relational and higher-dimensional data.

In conclusion, the paper provides a significant step towards reconciling model complexity with expressive power in knowledge graph completion tasks. By effectively utilizing complex embeddings, it sets a precedent for leveraging advanced mathematical tools to solve intricate machine learning challenges, with potential extensions and applications in diverse scientific and technological fields.