Approximate Summaries for Why and Why-not Provenance (Extended Version) (2002.00084v2)

Published 31 Jan 2020 in cs.DB

Abstract: Why and why-not provenance have been studied extensively in recent years. However, why-not provenance, and to a lesser degree why provenance, can be very large resulting in severe scalability and usability challenges. In this paper, we introduce a novel approximate summarization technique for provenance which overcomes these challenges. Our approach uses patterns to encode (why-not) provenance concisely. We develop techniques for efficiently computing provenance summaries balancing informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. Our approach is the first to scale to large datasets and to generate comprehensive and meaningful summaries.

Citations (21)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Approximate Summaries for Why and Why-not Provenance (Extended Version) (2002.00084v2)

Summary

Related Papers