- The paper introduces a unified framework that leverages energy-based neural embeddings with bilinear and linear transformations for improved KB representation.
- It demonstrates that the Bilinear-diag model achieves 73.2% HITS@10 on Freebase, significantly outperforming prior methods such as TransE (54.7%).
- It proposes an innovative embedding-based approach to extract compositional Horn rules, enhancing inference capabilities in knowledge bases.
Embedding Entities and Relations for Learning and Inference in Knowledge Bases
"Embedding Entities and Relations for Learning and Inference in Knowledge Bases" applies neural embedding techniques to representation learning in large-scale knowledge bases (KBs). The work unifies and improves upon existing models within a single framework, scrutinizing in particular the learning outcomes and inferential capacities of different embedding formulations.
General Framework for Multi-Relational Representation Learning
This paper introduces a systematic framework for embedding entities and relations within KBs. The learning of representations is carried out through neural networks with energy-based objectives. Two fundamental transformations are central to this framework:
- Bilinear Transformation: Captures the interrelations among entities by modeling each relation as a matrix.
- Linear Transformation: Used for simpler, often more interpretable, linear mappings.
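The two transformations can be sketched as scoring functions over entity embeddings. The following is a minimal NumPy sketch; the function names and toy vectors are illustrative, not the paper's code:

```python
import numpy as np

def bilinear_score(e1, M_r, e2):
    # Bilinear: relation r is a full matrix M_r; score = e1^T M_r e2
    return e1 @ M_r @ e2

def linear_score(e1, r_vec, e2):
    # Linear (translation-style, as in TransE): e1 + r should land near e2,
    # so a smaller distance means a higher (less negative) score
    return -np.linalg.norm(e1 + r_vec - e2, ord=1)
```

With the identity as `M_r`, the bilinear score reduces to a plain dot product; the linear score peaks at zero exactly when `e2 = e1 + r_vec`.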
The paper then examines a spectrum of embedding models, from the complex Neural Tensor Network (NTN) to the simpler TransE and the novel Bilinear-diag formulation. The comparative analysis reveals how effectiveness varies with the complexity of the relational transformation.
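Bilinear-diag restricts each relation matrix to a diagonal, which collapses the bilinear form to an elementwise triple product. A small sketch of that equivalence (illustrative names, not the authors' implementation):

```python
import numpy as np

def distmult_score(e1, w_r, e2):
    # Bilinear-diag: the relation matrix is diag(w_r), so the bilinear
    # form e1^T diag(w_r) e2 collapses to an elementwise triple product
    return float(np.sum(e1 * w_r * e2))
```

This matches `e1 @ np.diag(w_r) @ e2` while needing only d parameters per relation instead of d².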
Empirical Evaluation on Link Prediction
Link prediction, formulated as entity ranking, serves as the primary evaluation task. Using the WordNet and Freebase datasets, the models are assessed with HITS@10 (the fraction of test triples whose correct entity ranks in the top 10) and mean reciprocal rank (MRR). The empirical results show that Bilinear and its variant Bilinear-diag not only outperform their peers but also scale well: Bilinear-diag achieved 73.2% HITS@10 on Freebase versus 54.7% for TransE, establishing a new benchmark in link prediction.
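Given the rank of each correct entity among all candidates, both metrics reduce to simple aggregates. A minimal sketch (the example ranks are made up for illustration):

```python
import numpy as np

def ranking_metrics(ranks):
    # ranks[i] = 1-based position of the correct entity in the i-th ranked list
    ranks = np.asarray(ranks, dtype=float)
    hits_at_10 = float(np.mean(ranks <= 10))  # fraction ranked in the top 10
    mrr = float(np.mean(1.0 / ranks))         # mean reciprocal rank
    return hits_at_10, mrr

hits, mrr = ranking_metrics([1, 3, 12, 2, 50])  # illustrative ranks only
```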
Rule Extraction and Compositional Reasoning
An innovative contribution of this paper is the embedding-based approach to extracting compositional rules from KBs. By leveraging learned embeddings, the framework mines Horn rules that express logical entailments, typically of the form B1(a,b) ∧ B2(b,c) ⟹ H(a,c).
The procedure identifies and utilizes relational embeddings to approximate the compositionality inherent in relational data, outperforming state-of-the-art systems like AMIE in mining relevant logical rules. This effectiveness underscores the utility of embeddings in capturing the nuanced semantics of underlying KB structures.
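For the Bilinear-diag model, composing two relations amounts to multiplying their diagonal embeddings elementwise, so candidate Horn rules can be ranked by how close the composed vector lies to a head relation's embedding. A toy sketch of this idea (the function name and data are hypothetical):

```python
import numpy as np

def mine_length2_rules(rel_vecs, names, top_k=1):
    """Suggest rules B1(a,b) ∧ B2(b,c) ⟹ H(a,c): for diagonal relation
    matrices, matrix composition is an elementwise product of vectors."""
    rules = []
    for i, b1 in enumerate(names):
        for j, b2 in enumerate(names):
            composed = rel_vecs[i] * rel_vecs[j]
            # rank candidate head relations by Euclidean distance
            dists = np.linalg.norm(rel_vecs - composed, axis=1)
            for h in np.argsort(dists)[:top_k]:
                rules.append((b1, b2, names[h], float(dists[h])))
    return rules

# toy relations where r3's embedding equals r1 * r2 elementwise
rel_vecs = np.array([[1.0, 2.0], [3.0, 1.0], [3.0, 2.0]])
rules = mine_length2_rules(rel_vecs, ["r1", "r2", "r3"])
```

Here the pair (r1, r2) composes exactly to r3's embedding, so the rule r1 ∧ r2 ⟹ r3 is surfaced with distance zero.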
Implications and Future Work
The contributions of this paper have significant theoretical and practical implications:
- Practical Implications: The improved performance in link prediction and rule extraction can lead to more accurate and efficient inference mechanisms in KBs, benefiting applications in various domains such as information retrieval and natural language processing.
- Theoretical Implications: By demonstrating the strength of bilinear models, particularly the diagonalized variant, the paper suggests avenues for further refinement in representation learning techniques. The compositional semantics captured through these embeddings showcase the potential for deeper reasoning capabilities.
Looking ahead, the potential integration of deep learning architectures could enhance the hierarchical and semantic understanding of multi-relational data. Furthermore, implementing tensor constructs could improve the flexibility and capacity of these models, paving the way for more sophisticated inference mechanisms.
Conclusion
This paper makes substantial advancements in embedding-based inference within KBs. The unified framework and empirical validations underscore the superiority of bilinear embeddings for tasks like link prediction and rule extraction. By adeptly capturing and leveraging relation semantics through efficient representations, the contributions hold promise for advancing both theoretical insights and practical applications within artificial intelligence and knowledge engineering.