Complex Query Answering with Neural Link Predictors (2011.03459v4)

Published 6 Nov 2020 in cs.LG, cs.AI, cs.LO, and cs.NE

Abstract: Neural link predictors are immensely useful for identifying missing edges in large scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries that arise in a number of domains, such as queries using logical conjunctions ($\land$), disjunctions ($\lor$) and existential quantifiers ($\exists$), while accounting for missing edges. In this work, we propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem, including gradient-based and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods -- black-box neural models trained on millions of generated queries -- without the need of training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across different knowledge graphs containing factual information. Finally, we demonstrate that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online, at https://github.com/uclnlp/cqd.

Citations (120)

Summary

The paper proposes Continuous Query Decomposition (CQD), a framework that turns complex first-order queries into differentiable optimization problems whose atoms are scored by a pre-trained neural link predictor.
It models logical conjunctions and disjunctions with t-norms and t-conorms, and solves the resulting problem with either gradient-based optimization (CQD-CO) or beam search (CQD-Beam).
Empirical evaluations on standard benchmarks show relative improvements in Hits@3 of 8% up to 40% over state-of-the-art models, using orders of magnitude less training data.

An Overview of "Complex Query Answering with Neural Link Predictors"

The paper "Complex Query Answering with Neural Link Predictors" by Erik Arakelyan, Daniel Daza, Pasquale Minervini, and Michael Cochez addresses the challenge of answering complex queries on incomplete Knowledge Graphs (KGs) using neural link predictors. The authors propose a formal framework that efficiently translates complex First-Order Logical Queries into end-to-end differentiable objectives, using a pre-trained neural link predictor to compute the truth value of each query atom.
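To make "computing the truth value of a query atom" concrete, here is a minimal sketch of how a link predictor might score a single atom. It assumes a ComplEx-style scoring function followed by a sigmoid; the summary above does not name the scoring function, and the embeddings and entity names are made-up illustrative values, not taken from the paper or its code.

```python
import math

def complex_score(subj, rel, obj):
    """ComplEx-style score: real part of sum_i subj_i * rel_i * conj(obj_i)."""
    return sum((s * r * o.conjugate()).real for s, r, o in zip(subj, rel, obj))

def truth_value(subj, rel, obj):
    """Map the raw score to [0, 1] with a logistic sigmoid (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-complex_score(subj, rel, obj)))

# Hypothetical 2-dimensional complex embeddings (made-up values).
paris = [1 + 1j, 0.5 - 0.2j]
capital_of = [0.8 + 0.1j, 1.0 + 0.0j]
france = [1 + 0.9j, 0.4 - 0.3j]

# Truth value of the atom capital_of(paris, france) under these toy embeddings.
print(truth_value(paris, capital_of, france))
```

Any link predictor that returns a score in [0, 1] for a (subject, relation, object) triple could play the same role.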

Knowledge Graphs are a flexible tool for storing relational data across diverse domains, ranging from general knowledge bases like DBpedia to domain-specific graphs in the life sciences. Neural link predictors have been used to predict missing connections within these graphs. However, answering complex queries that combine conjunctions, disjunctions, and existential quantifiers remains an open challenge when the underlying data is incomplete.

Technical Contributions

The core technical contribution is a method for answering Existential Positive First-Order (EPFO) logical queries with a pre-trained neural link predictor, by casting each query as an optimization problem. The authors introduce Continuous Query Decomposition (CQD), a framework that searches for the variable assignments that maximize the score of the query, using either continuous or combinatorial optimization.
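The optimization problem can be written out roughly as follows. This is a sketch of the formulation under assumed notation: $A$ is the answer variable, $V_1, \dots, V_m$ are existentially quantified variables, $e_i^j$ are query atoms, $\phi_p$ is the link predictor's score for relation $p$, and $\top$ and $\bot$ denote a t-norm and a t-conorm. The symbols are chosen for illustration and may differ from the paper's.

```latex
% An EPFO query in disjunctive normal form
\mathcal{Q}[A] \triangleq\ ?A : \exists V_1, \dots, V_m .
  \left( e_1^1 \land \dots \land e_{n_1}^1 \right) \lor \dots \lor
  \left( e_1^d \land \dots \land e_{n_d}^d \right)

% Relaxed objective: replace \land with a t-norm \top and \lor with a t-conorm \bot
\operatorname*{arg\,max}_{A, V_1, \dots, V_m}
  \left( e_1^1 \top \dots \top e_{n_1}^1 \right) \bot \dots \bot
  \left( e_1^d \top \dots \top e_{n_d}^d \right),
\qquad e_i^j = \phi_p(c, V) \ \text{or} \ \phi_p(V, V') \in [0, 1]
```

Here $c$ is a constant (anchor) entity, and each atom's value is the score the link predictor assigns to the corresponding edge.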

CQD aggregates the scores of individual atoms with t-norms and t-conorms, continuous generalizations of logical conjunction and disjunction, so that the query score is differentiable with respect to the embeddings of its variables. The authors analyse two solvers: CQD-CO, which optimizes the variable embeddings directly with gradient-based continuous optimization, and CQD-Beam, which uses beam search over entities as a combinatorial optimization strategy. Because beam search keeps only the top-scoring candidates for each intermediate variable, it can answer complex queries on incomplete graphs without enumerating every possible variable assignment.
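The sketch below illustrates the two ingredients on a toy example: the product and Gödel (min/max) t-norms and t-conorms, and a beam search for a two-hop query of the form ?A : ∃V . rel1(anchor, V) ∧ rel2(V, A). The `SCORES` table stands in for a pre-trained link predictor, and the entity names, relation names, and scores are all hypothetical. It is an illustration of the idea, not the authors' implementation, and it does not cover the gradient-based CQD-CO variant.

```python
def product_t_norm(a, b):
    return a * b

def godel_t_norm(a, b):
    return min(a, b)

def product_t_conorm(a, b):
    return a + b - a * b

def godel_t_conorm(a, b):
    return max(a, b)

# Toy stand-in for a link predictor: (subject, relation, object) -> score in [0, 1].
SCORES = {
    ("turing", "born_in", "london"): 0.9,
    ("turing", "born_in", "paris"): 0.2,
    ("london", "located_in", "uk"): 0.95,
    ("london", "located_in", "france"): 0.05,
    ("paris", "located_in", "france"): 0.9,
    ("paris", "located_in", "uk"): 0.1,
}
ENTITIES = ["turing", "london", "paris", "uk", "france"]

def score(subj, rel, obj):
    """Unknown triples get score 0.0 in this toy table."""
    return SCORES.get((subj, rel, obj), 0.0)

def beam_two_hop(anchor, rel1, rel2, k=2, t_norm=product_t_norm):
    """Rank answers A for: exists V . rel1(anchor, V) and rel2(V, A)."""
    # Step 1: keep only the k best candidates for the intermediate variable V.
    beam = sorted(((score(anchor, rel1, v), v) for v in ENTITIES), reverse=True)[:k]
    # Step 2: for each answer, keep the best conjunction score over the beam,
    # remembering which V produced it (this is the "explanation").
    best = {}
    for s1, v in beam:
        for a in ENTITIES:
            joint = t_norm(s1, score(v, rel2, a))
            if joint > best.get(a, (0.0, None))[0]:
                best[a] = (joint, v)
    return sorted(((s, a, v) for a, (s, v) in best.items()), reverse=True)

for s, answer, via in beam_two_hop("turing", "born_in", "located_in"):
    print(f"{answer}: {s:.3f} (via {via})")
```

The intermediate assignment returned alongside each answer is what makes the result inspectable: one can see which entity was chosen for V and how strongly each atom was believed.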

Empirical Evaluation

The empirical analysis uses the standard datasets FB15k, FB15k-237, and NELL995. The experiments cover a set of complex query types, including chained and intersecting predicate queries, and report Hits@3 as the performance metric. The results show that CQD-Beam outperforms state-of-the-art models such as Graph Query Embedding (GQE) and Query2Box (Q2B) while using orders of magnitude less training data. CQD achieves relative improvements in Hits@3 ranging from 8% up to 40% across the knowledge graphs.
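For readers unfamiliar with the metric, the following sketch shows how Hits@k and a relative improvement can be computed. It is simplified to one gold answer per query and does no filtering of other correct answers; the rankings and the 0.25 and 0.35 scores are made-up values for illustration, not results from the paper.

```python
def hits_at_k(ranked_lists, gold_answers, k=3):
    """Fraction of queries whose gold answer appears among the top-k ranked entities."""
    hits = sum(1 for ranked, gold in zip(ranked_lists, gold_answers) if gold in ranked[:k])
    return hits / len(gold_answers)

def relative_improvement(baseline, new):
    """Relative change of `new` over `baseline`, e.g. 0.4 means 40%."""
    return (new - baseline) / baseline

# Two toy queries: the first gold answer is ranked 3rd, the second is ranked 4th.
rankings = [["a", "b", "c", "d"], ["x", "y", "z", "w"]]
gold = ["c", "w"]
print(hits_at_k(rankings, gold, k=3))

# Hypothetical Hits@3 scores for a baseline and a new model.
print(relative_improvement(0.25, 0.35))
```

Note that a relative improvement of 40% is different from a gain of 40 percentage points: in the toy numbers above, Hits@3 rises by 10 points.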

Theoretical and Practical Implications

Because CQD needs only a link predictor trained on individual edges, it removes the need for large datasets of generated complex queries, which methods such as Q2B require for training. The framework is also more transparent than black-box neural models: the intermediate solutions identified for each query atom can be inspected, which lets researchers follow how an answer was reached.

From a practical perspective, the ability to answer complex queries over incomplete information is useful in domains where KGs are incomplete and continuously evolving, with potential applications in semantic search, recommendation systems, and automated knowledge discovery.

Future Outlook

The framework proposed by Arakelyan et al. may influence future developments in AI and knowledge-based systems, prompting further research into optimizing neural networks for logical query answering. Future developments might explore more intricate query types and further refine the efficiency of optimization techniques, as well as investigate the applicability of CQD across various domains with differing KG types and structures.

Overall, the paper makes a significant contribution to complex query answering on knowledge graphs, with an approach that builds on the strengths of neural link predictors without requiring training on a large and diverse set of complex queries.

GitHub

GitHub - uclnlp/cqd: Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs (95 stars)
