PRESTO: Probabilistic Cardinality Estimation for RDF Queries Based on Subgraph Overlapping

Published 19 Jan 2018 in cs.DB | (1801.06408v1)

Abstract: In query optimisation, accurate cardinality estimation is essential for finding optimal query plans. It is especially challenging for RDF due to the lack of an explicit schema and the large number of joins in RDF queries. Existing approaches typically collect statistics based on the counts of triples and estimate the cardinality of a query as the product of its join components, so errors can accumulate even when the estimate of each component is accurate. In contrast to existing methods, we propose PRESTO, a cardinality estimation method that is based on the counts of subgraphs instead of triples and uses a probabilistic method to estimate the cardinalities of RDF queries as a whole. PRESTO avoids some major issues of existing approaches and is able to accurately estimate arbitrary queries under a bounded memory constraint. We evaluate PRESTO on YAGO and show that it is more accurate for both simple and complex queries.
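
To make the problem with product-based estimation concrete, the following is a minimal sketch of the kind of triple-count baseline the abstract argues against. It is not PRESTO's algorithm. The toy data, the predicate names (`authorOf`, `wonPrize`), and the helper functions are all hypothetical, and the estimator is the textbook join formula |R| * |S| / max(V(R), V(S)), which assumes join values are spread uniformly and independently.

```python
from collections import Counter

# Toy RDF graph as (subject, predicate, object) triples. All data is made up.
TRIPLES = [
    ("a1", "authorOf", "b1"),
    ("a1", "authorOf", "b2"),
    ("a1", "authorOf", "b3"),
    ("a2", "authorOf", "b4"),
    ("a4", "authorOf", "b5"),
    ("a1", "wonPrize", "p1"),
    ("a1", "wonPrize", "p2"),
    ("a3", "wonPrize", "p3"),
]


def subject_counts(triples, predicate):
    """Count how many triples each subject has for the given predicate."""
    return Counter(s for s, p, _ in triples if p == predicate)


def true_cardinality(triples, pred_a, pred_b):
    """Exact result size of the star query: ?x pred_a ?y . ?x pred_b ?z"""
    left = subject_counts(triples, pred_a)
    right = subject_counts(triples, pred_b)
    # Each subject contributes (matches for pred_a) * (matches for pred_b).
    return sum(n * right[s] for s, n in left.items())


def independence_estimate(triples, pred_a, pred_b):
    """Textbook estimate from per-predicate statistics only (an assumed baseline)."""
    left = subject_counts(triples, pred_a)
    right = subject_counts(triples, pred_b)
    if not left or not right:
        return 0.0
    # |R| * |S| / max(distinct subjects in R, distinct subjects in S)
    return sum(left.values()) * sum(right.values()) / max(len(left), len(right))


if __name__ == "__main__":
    print("true:", true_cardinality(TRIPLES, "authorOf", "wonPrize"))
    print("estimate:", independence_estimate(TRIPLES, "authorOf", "wonPrize"))
```

Both per-predicate counts here are exact, yet the estimate differs from the true result size because the two predicates are correlated on subject `a1`. Each further join pattern multiplies in another such factor, which is how the error can compound across a query.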

Citations (2)
