Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation (1801.09619v2)
Abstract: Estimating the cardinality (i.e., the number of answers) of conjunctive queries is particularly difficult in RDF systems: queries over RDF data are navigational and thus tend to involve many joins. We present a new, principled cardinality estimation technique based on graph summarisation. We interpret a summary of an RDF graph using a possible-world semantics and formalise the estimation problem as computing the expected cardinality over all RDF graphs represented by the summary, and we present a closed-form formula for computing the expected cardinality of arbitrary queries. We also discuss approaches to RDF graph summarisation. Finally, we show empirically that our cardinality estimation technique is more accurate and more consistent, often by orders of magnitude, than the state of the art.
- Giorgio Stefanoni
- Boris Motik
- Egor V. Kostylev
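
To make the possible-world reading in the abstract concrete, the following is a minimal brute-force sketch, not the paper's algorithm or its closed-form formula. It assumes a toy summary model that is our own simplification: resources are grouped into three buckets `A`, `B`, `C`, a summary edge with weight `w` between two buckets stands for exactly `w` distinct triples between their members, and all graphs consistent with the summary are equally likely. The function names and the example data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def possible_edge_sets(sources, targets, weight):
    """All ways to choose `weight` distinct edges from sources x targets.

    Assumption: a summary edge of weight w represents exactly w distinct
    triples between the two buckets, and every choice is equally likely.
    """
    return list(combinations(product(sources, targets), weight))


def path_count(p_edges, q_edges):
    """Number of answers (x, y, z) to the path query x -p-> y -q-> z."""
    return sum(1 for (_, y1) in p_edges for (y2, _) in q_edges if y1 == y2)


def expected_path_cardinality(a, b, c, w_p, w_q):
    """Average the query cardinality over every possible world."""
    p_worlds = possible_edge_sets(a, b, w_p)
    q_worlds = possible_edge_sets(b, c, w_q)
    total = sum(path_count(p, q) for p in p_worlds for q in q_worlds)
    return Fraction(total, len(p_worlds) * len(q_worlds))


# Hypothetical toy summary: bucket sizes 2, 3, 2; edge weights 3 and 2.
A = ["a1", "a2"]
B = ["b1", "b2", "b3"]
C = ["c1", "c2"]

estimate = expected_path_cardinality(A, B, C, w_p=3, w_q=2)
print(estimate)  # 2
```

In this toy setting, linearity of expectation gives the same value without enumeration: each of the |A|·|B|·|C| candidate answers occurs with probability (w_p / (|A|·|B|)) · (w_q / (|B|·|C|)), so the expectation is w_p · w_q / |B| = 3 · 2 / 3 = 2. Enumeration grows combinatorially with bucket size, which is why a closed-form expression of the kind the abstract mentions matters in practice.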