An epistemic approach to model uncertainty in data-graphs (2109.14112v2)

Published 29 Sep 2021 in cs.DB and cs.AI

Abstract: Graph databases are becoming widely successful as data models that can effectively represent and process complex relationships among various types of data. As with any other type of data repository, graph databases may suffer from errors and discrepancies with respect to the real-world data they intend to represent. In this work we explore the notion of probabilistic unclean graph databases, previously proposed for relational databases, to capture the idea that the observed (unclean) graph database is actually a noisy version of a clean one that correctly models the world but that we only partially know. Since many factors may be involved in the observation, e.g., all kinds of clerical errors or unintended transformations of the data, we assume a probabilistic model that describes the distribution over all possible ways in which the clean (uncertain) database could have been polluted. Based on this model we define two computational problems, data cleaning and probabilistic query answering, and study the complexity of each when the transformation of the database is caused by either removing (subset) or adding (superset) nodes and edges.
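
To make the two problems named in the abstract concrete, here is a minimal toy sketch of the subset case. It is not the paper's model: it assumes, purely for illustration, that only edges are affected, that each candidate edge is in the clean graph independently with probability `prior_edge`, and that each clean edge is dropped from the observation independently with probability `p_remove`. All function names and parameters are hypothetical. Under those assumptions, data cleaning amounts to picking the most probable clean graph given the observation, and probabilistic query answering amounts to summing the posterior mass of the clean graphs that satisfy a query.

```python
from itertools import combinations


def posterior_over_clean_graphs(observed, candidate_edges,
                                prior_edge=0.5, p_remove=0.2):
    """Posterior over clean edge sets, given an observed edge set.

    Toy assumptions (not from the paper): the observation is obtained
    from the clean graph by deleting edges only, so every clean graph
    must contain all observed edges.
    """
    observed = frozenset(observed)
    extra = sorted(set(candidate_edges) - observed)
    universe_size = len(observed) + len(extra)
    weights = {}
    # Enumerate every superset of the observed edges (exponential, toy only).
    for k in range(len(extra) + 1):
        for added in combinations(extra, k):
            clean = observed | frozenset(added)
            # Prior: each candidate edge is present independently.
            prior = (prior_edge ** len(clean)
                     * (1 - prior_edge) ** (universe_size - len(clean)))
            # Likelihood: observed edges survived, the others were removed.
            likelihood = ((1 - p_remove) ** len(observed)
                          * p_remove ** len(added))
            weights[clean] = prior * likelihood
    total = sum(weights.values())
    return {graph: w / total for graph, w in weights.items()}


def most_likely_clean(posterior):
    """Data cleaning in this toy setting: the maximum a posteriori graph."""
    return max(posterior, key=posterior.get)


def query_probability(posterior, query):
    """Probability that a Boolean query holds in the clean graph."""
    return sum(p for graph, p in posterior.items() if query(graph))


observed = {("a", "b")}
candidates = [("a", "b"), ("b", "c")]
post = posterior_over_clean_graphs(observed, candidates)
best = most_likely_clean(post)
prob_bc = query_probability(post, lambda g: ("b", "c") in g)
```

The enumeration is exponential in the number of unobserved candidate edges, which is acceptable only for a toy input; the abstract's focus on complexity is about exactly this kind of cost in the general setting.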

Authors (5)
  1. Sergio Abriola (7 papers)
  2. Santiago Cifuentes (9 papers)
  3. Nina Pardal (17 papers)
  4. Edwin Pin (6 papers)
  5. María Vanina Martínez (3 papers)
Citations (1)
