Neurosymbolic AI for Reasoning over Knowledge Graphs: A Survey (2302.07200v3)

Published 14 Feb 2023 in cs.AI, cs.LO, and stat.ML

Abstract: Neurosymbolic AI is an increasingly active area of research that combines symbolic reasoning methods with deep learning to leverage their complementary benefits. As knowledge graphs are becoming a popular way to represent heterogeneous and multi-relational data, methods for reasoning on graph structures have attempted to follow this neurosymbolic paradigm. Traditionally, such approaches have utilized either rule-based inference or generated representative numerical embeddings from which patterns could be extracted. However, several recent studies have attempted to bridge this dichotomy to generate models that facilitate interpretability, maintain competitive performance, and integrate expert knowledge. Therefore, we survey methods that perform neurosymbolic reasoning tasks on knowledge graphs and propose a novel taxonomy by which we can classify them. Specifically, we propose three major categories: (1) logically-informed embedding approaches, (2) embedding approaches with logical constraints, and (3) rule learning approaches. Alongside the taxonomy, we provide a tabular overview of the approaches and links to their source code, if available, for more direct comparison. Finally, we discuss the unique characteristics and limitations of these methods, then propose several prospective directions toward which this field of research could evolve.

The paper "Neurosymbolic AI for Reasoning over Knowledge Graphs: A Survey" reviews methods that apply neurosymbolic AI to reasoning over knowledge graphs. These methods combine symbolic reasoning with deep learning to handle the heterogeneous, multi-relational data that knowledge graphs represent.

Overview and Classification of Neurosymbolic Methods

Neurosymbolic AI attempts to merge the strengths of symbolic AI, which is known for interpretability and logical inference, with those of deep learning, which excels at pattern recognition over large datasets. Knowledge graphs (KGs) represent multi-relational data in a structured format, which makes them natural candidates for neurosymbolic techniques. The paper introduces a novel taxonomy that divides neurosymbolic methods for KGs into three primary categories:

Logically-Informed Embedding Approaches: These methods first augment KGs through logical inference, then derive numerical embeddings from the augmented graph to improve prediction tasks. The paper highlights two subcategories: modular approaches with a clear two-step process, and iterative guidance approaches in which symbolic modules repeatedly refine neural modules.
Learning with Logical Constraints: This class of methods imposes logical constraints directly onto the learning process of neural networks. The constraints guide the training towards adhering to known logical relationships inherent in the data, ultimately helping to fuse symbolic domain knowledge into the embedding space or the prediction process.
Rule Learning for Knowledge Graph Completion: Rather than relying on predefined logical rules, these approaches learn logical rules and their confidences from the KG itself. The paper identifies techniques based on the Expectation-Maximization (EM) algorithm as central to this category, covering both rule weight learning and rule mining approaches.
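
To make the three categories concrete, the following is a minimal Python sketch on a toy graph. It is not taken from the survey or from any method it covers. The triples, the rule born_in(x, y) AND city_of(y, z) => nationality(x, z), and all function names are hypothetical. The confidence measure is the plain support ratio used in classical rule mining, not the EM-based weighting the survey describes, and the hinge penalty is only one common way to encode an implication as a soft constraint.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Toy KG; every entity and relation name here is invented for illustration.
KG: Set[Triple] = {
    ("alice", "born_in", "paris"),
    ("bob", "born_in", "lyon"),
    ("carol", "born_in", "berlin"),
    ("paris", "city_of", "france"),
    ("lyon", "city_of", "france"),
    ("berlin", "city_of", "germany"),
    ("alice", "nationality", "france"),
    ("carol", "nationality", "austria"),
}


def body_pairs(kg: Set[Triple], r1: str, r2: str) -> Set[Tuple[str, str]]:
    """Pairs (x, z) with r1(x, y) and r2(y, z) for some y (the rule body)."""
    return {
        (x, z)
        for (x, p, y) in kg if p == r1
        for (y2, q, z) in kg if q == r2 and y2 == y
    }


def augment(kg: Set[Triple], r1: str, r2: str, head: str) -> Set[Triple]:
    """Category 1 (modular variant): materialize the rule's conclusions so
    that a downstream embedding model (not shown) trains on a larger graph."""
    return kg | {(x, head, z) for (x, z) in body_pairs(kg, r1, r2)}


def implication_penalty(body_truth: float, head_truth: float) -> float:
    """Category 2: hinge penalty, positive only when the model scores the
    rule body as more plausible than the rule head."""
    return max(0.0, body_truth - head_truth)


def rule_confidence(kg: Set[Triple], r1: str, r2: str, head: str) -> float:
    """Category 3: fraction of body matches whose head triple is in the KG."""
    pairs = body_pairs(kg, r1, r2)
    if not pairs:
        return 0.0
    supported = sum((x, head, z) in kg for (x, z) in pairs)
    return supported / len(pairs)


RULE = ("born_in", "city_of", "nationality")
augmented = augment(KG, *RULE)           # KG plus inferred nationality triples
confidence = rule_confidence(KG, *RULE)  # share of body matches seen in the KG
```

In a full system, the augmented graph would feed an embedding trainer, the penalty would be added to the embedding loss with a tunable weight, and the confidence would be refined by a learning procedure rather than computed once from counts.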

Implications and Future Directions

Integrating symbolic reasoning into neural network training improves interpretability, because the resulting models can produce human-readable explanations for their decisions. This is particularly valuable in domains that require transparent decision-making, such as biomedical research, autonomous systems, and social network analysis.

The survey also draws attention to limitations of current neurosymbolic approaches. Integrating symbolic inference increases model complexity, scalability remains a challenge, and many methods depend on high-quality domain-specific rules or ontologies. Nevertheless, the potential benefits, including the ability to incorporate domain knowledge directly into learning algorithms and to improve predictions by modeling dependencies in the data, motivate further work.

Looking ahead, neurosymbolic approaches could benefit several application areas by enabling more accurate predictions from incomplete KGs, handling multimodal data, and improving few-shot learning through richer data representations.

In summary, the survey underscores the growing importance of neurosymbolic AI for knowledge graph reasoning. By classifying existing approaches and outlining future directions, it gives researchers a basis for studying how symbolic and neural methods can be combined on complex structured data, with the aim of improving both reasoning accuracy and interpretability in knowledge-rich domains.