
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion

Published 7 Oct 2020 in cs.CL | (2010.03548v2)

Abstract: A case-based reasoning (CBR) system solves a new problem by retrieving "cases" that are similar to the given problem. If such a system can achieve high accuracy, it is appealing owing to its simplicity, interpretability, and scalability. In this paper, we demonstrate that such a system is achievable for reasoning in knowledge-bases (KBs). Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB. Our probabilistic model estimates the likelihood that a path is effective at answering a query about the given entity. The parameters of our model can be efficiently computed using simple path statistics and require no iterative optimization. Our model is non-parametric, growing dynamically as new entities and relations are added to the KB. On several benchmark datasets our approach significantly outperforms other rule learning approaches and performs comparably to state-of-the-art embedding-based approaches. Furthermore, we demonstrate the effectiveness of our model in an "open-world" setting where new entities arrive in an online fashion, significantly outperforming state-of-the-art approaches and nearly matching the best offline method. Code available at https://github.com/ameyagodbole/Prob-CBR

Citations (11)

Summary

  • The paper proposes a non-parametric, case-based reasoning approach that dynamically retrieves paths from similar entities to complete evolving knowledge graphs.
  • It leverages probabilistic weighting to assess path frequency and precision, achieving competitive performance on benchmark datasets.
  • The model employs hierarchical clustering and efficient online updates, eliminating the need for extensive retraining in open-world settings.

Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion

The paper "Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion" (2010.03548) introduces a non-parametric model that uses case-based reasoning (CBR) for Knowledge Graph (KG) completion, with a focus on evolving KGs where new entities and relations continuously emerge. To answer a query about an entity, the model gathers reasoning paths from similar entities in the KG and weights each path probabilistically, which improves precision and lets the model adapt in open-world settings.

Model Overview and Methodology

Case-based Reasoning Approach

The proposed approach applies a k-nearest neighbor (KNN)-like mechanism, akin to CBR systems in classical AI, to accommodate dynamic and open-world KGs. Given a query, the model retrieves entities similar to the query entity and collects the reasoning paths that connect those entities to their answers for the query relation. Each path is then scored with two probabilistic estimates: its prior (how frequently the path occurs) and its precision (how likely the path is to lead to a correct answer).

Figure 1: Given the query (John von Neumann, place_of_death, ?), the model gathers reasoning paths from similar entities, emphasizing the importance of learning path weights for clusters of similar entities.
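
The retrieve-and-reuse loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy triples, the Jaccard similarity over outgoing relation types, the helper names (build_graph, relation_paths, follow, answer), and the unweighted path counting are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict

# Toy KG as (head, relation, tail) triples; every fact here is illustrative.
TRIPLES = [
    ("alice", "born_in", "paris"),
    ("alice", "lived_in", "lyon"),
    ("alice", "died_in", "lyon"),
    ("bob", "born_in", "rome"),
    ("bob", "lived_in", "milan"),
    ("bob", "died_in", "milan"),
    ("carol", "born_in", "oslo"),
    ("carol", "lived_in", "bergen"),
]

def build_graph(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[head].append((rel, tail))
    return graph

def relation_paths(graph, source, target, max_len=2):
    """Relation sequences of at most max_len hops that lead from source to target."""
    found, stack = [], [(source, ())]
    while stack:
        node, path = stack.pop()
        if node == target and path:
            found.append(path)
        if len(path) < max_len:
            for rel, nxt in graph.get(node, []):
                stack.append((nxt, path + (rel,)))
    return found

def follow(graph, source, path):
    """Entities reached by traversing the relation sequence from source."""
    nodes = {source}
    for rel in path:
        nodes = {t for n in nodes for r, t in graph.get(n, []) if r == rel}
    return nodes

def similarity(graph, a, b):
    """Jaccard overlap of outgoing relation types (an assumed similarity measure)."""
    ra = {r for r, _ in graph.get(a, [])}
    rb = {r for r, _ in graph.get(b, [])}
    return len(ra & rb) / len(ra | rb) if ra | rb else 0.0

def answer(graph, entity, query_rel, k=2):
    """Rank candidate answers by reusing paths from the k most similar entities."""
    candidates = [e for e in graph
                  if e != entity and any(r == query_rel for r, _ in graph[e])]
    neighbors = sorted(candidates,
                       key=lambda e: (-similarity(graph, entity, e), e))[:k]
    path_counts = Counter()
    for n in neighbors:
        for rel, tail in graph[n]:
            if rel == query_rel:
                for path in relation_paths(graph, n, tail):
                    if path != (query_rel,):  # skip the edge being predicted
                        path_counts[path] += 1
    scores = Counter()
    for path, count in path_counts.items():
        for candidate in follow(graph, entity, path):
            scores[candidate] += count
    return scores.most_common()

print(answer(build_graph(TRIPLES), "carol", "died_in"))
```

Here both neighbors contribute the path (lived_in,), so the only candidate for carol is bergen. Raw counts stand in for the probabilistic path weights the paper describes.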

Non-parametric Learning

The model is non-parametric: it grows as the KG expands, without needing extensive retraining. Its parameters are derived from path statistics within clusters of similar entities, which keeps the approach scalable and efficient. The clusters are formed using hierarchical agglomerative clustering, and the parameter estimates are computed through simple count statistics rather than iterative optimization.

Figure 2: Illustration of the non-parametric model's ability to handle newly added entities and infer new facts without requiring expensive training.
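
To make "simple count statistics" concrete, the sketch below estimates a prior and a precision for each path type within one cluster and one query relation. The exact definitions are assumptions for illustration and may differ from the paper's formulas: here the prior is a path's share of successful traversals, and the precision is the fraction of entities it reaches that are correct answers. The record format and the function name estimate_path_weights are hypothetical.

```python
from collections import Counter

def estimate_path_weights(records):
    """Estimate per-path prior and precision from traversal counts.

    Each record is (path, n_reached, n_correct): one traversal of `path` from
    an entity in the cluster, the number of entities it reached, and how many
    of those were correct answers for the query relation.
    """
    successes = Counter()  # traversals that hit at least one correct answer
    reached = Counter()    # total entities reached per path
    correct = Counter()    # total correct answers reached per path
    for path, n_reached, n_correct in records:
        reached[path] += n_reached
        correct[path] += n_correct
        if n_correct > 0:
            successes[path] += 1
    total = sum(successes.values())
    prior = {p: successes[p] / total for p in reached} if total else {}
    precision = {p: correct[p] / reached[p] for p in reached if reached[p]}
    return prior, precision

# Illustrative counts for the query relation died_in in one cluster.
records = [
    (("lived_in",), 1, 1),
    (("lived_in",), 1, 1),
    (("lived_in",), 2, 1),
    (("born_in",), 1, 0),
    (("born_in",), 1, 1),
]
prior, precision = estimate_path_weights(records)
scores = {p: prior[p] * precision[p] for p in prior}
print(scores)
```

No gradient steps are involved: adding a new record only increments counters, which is what lets estimates of this kind be refreshed cheaply as the KG grows.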

Implementation Details

Path Statistics and Clustering

To estimate path relevance accurately, entities are clustered and path statistics are computed per cluster rather than per entity. Sharing statistics across a cluster mitigates both parameter explosion and the noise caused by sparsity. The method also fits an online setting: it uses Grinch, an online hierarchical clustering algorithm, to adapt the clustering dynamically as the data evolves.
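
As a rough picture of the clustering step, the sketch below groups entities by the relation types they participate in. It is a plain batch average-linkage agglomerative procedure with a cosine-similarity cutoff, written for illustration only; it is not Grinch and has none of its online behavior. The entity vectors, the threshold value, and the function names are assumptions.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def average_link(a, b, vectors):
    """Mean pairwise cosine similarity between the members of two clusters."""
    total = sum(cosine(vectors[x], vectors[y]) for x in a for y in b)
    return total / (len(a) * len(b))

def agglomerative(vectors, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    clusters = [frozenset([name]) for name in sorted(vectors)]
    while len(clusters) > 1:
        best = max(combinations(clusters, 2),
                   key=lambda pair: average_link(pair[0], pair[1], vectors))
        if average_link(best[0], best[1], vectors) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters

# Sparse relation-count vectors for four illustrative entities.
vectors = {
    "einstein": {"born_in": 1, "works_at": 1, "field": 1},
    "curie": {"born_in": 1, "works_at": 1, "field": 1, "spouse": 1},
    "paris": {"capital_of": 1, "located_in": 1},
    "rome": {"capital_of": 1, "located_in": 1},
}
print(agglomerative(vectors))
```

The two people end up in one cluster and the two cities in another, so path statistics for a relation such as place_of_death would be pooled only over entities of the same kind.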

Handling Open-world Dynamics

The model effectively manages open-world scenarios by:

  1. Dynamically updating entity representations using sparse vectors of edge types.
  2. Applying fast and efficient parameter updates only for affected entities using path length cycles.
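
The two steps above can be illustrated with a small bookkeeping class. This is a sketch under assumptions rather than the paper's update procedure: the class OnlineKG, the "_inv" suffix for inverse edges, and the rule that entities within a fixed hop radius of a new edge are the ones whose statistics need refreshing are all hypothetical choices for this example.

```python
from collections import defaultdict, deque

class OnlineKG:
    """Keeps sparse relation-count vectors current as edges stream in."""

    def __init__(self):
        # Undirected adjacency, used only to find the neighborhood of a change.
        self.adj = defaultdict(set)
        # Sparse entity representation: relation type -> count.
        self.vec = defaultdict(lambda: defaultdict(int))

    def add_edge(self, head, rel, tail, radius=1):
        """Insert an edge and return the entities whose statistics may change."""
        self.vec[head][rel] += 1
        self.vec[tail][rel + "_inv"] += 1  # assumed naming for inverse edges
        self.adj[head].add(tail)
        self.adj[tail].add(head)
        return self.affected({head, tail}, radius)

    def affected(self, seeds, radius):
        """Breadth-first search out to `radius` hops from the seed entities."""
        seen = set(seeds)
        frontier = deque((s, 0) for s in seeds)
        while frontier:
            node, depth = frontier.popleft()
            if depth == radius:
                continue
            for nb in self.adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    frontier.append((nb, depth + 1))
        return seen

kg = OnlineKG()
kg.add_edge("a", "knows", "b")
kg.add_edge("b", "knows", "c")
kg.add_edge("d", "knows", "e")
touched = kg.add_edge("c", "works_at", "x")
print(sorted(touched))
```

Only b, c, and x are flagged after the last insertion; a, d, and e keep their existing statistics, which is the property that keeps incremental updates cheap.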

Experimental Results

Knowledge Base Completion

The model demonstrates competitive performance on traditional KBC tasks. On benchmark datasets such as NELL-995, FB122, and WN18RR, it significantly outperforms other rule-learning approaches and performs comparably to state-of-the-art embedding-based models. The gains over rule-learning baselines highlight the efficacy of its path weighting.


Figure 3: Performance results indicate the model surpasses other approaches in an open-world setting using 10-30% of seen edges.

Open-world Settings

In the streaming setting, where new entities arrive in batches, the model nearly matches the best offline method. It substantially outperforms parametric models such as RotatE, which struggle with forgetting as new data is added.

Figure 4: Number of entities added to KB in each batch and number of entities modified in each update, showcasing the model's efficiency in incremental updates.

Conclusion

The presented approach represents a simple yet effective method for KG completion, aligning with open-world requirements by dynamically learning from nearest entities and efficiently updating parameters. The model's significant advantage lies in its ability to deliver close to state-of-the-art performance without relying on extensive retraining, positioning it as a viable solution for handling ever-evolving KGs. Future research could refine the clustering process or expand on path types to further boost performance in more diverse KG scenarios.
