Mining CFD Rules on Big Data

Published 5 Aug 2018 in cs.DB | (1808.01621v1)

Abstract: Current conditional functional dependency (CFD) discovery algorithms require a well-prepared training data set. This makes them difficult to apply to large datasets, which are typically of low quality. To handle the volume of big data, we develop sampling algorithms that obtain a small, representative training set. To handle its low quality, we design a fault-tolerant rule discovery algorithm and a conflict resolution algorithm. We also propose a parameter selection strategy for the CFD discovery algorithm to ensure its effectiveness. Experimental results demonstrate that our method can discover effective CFD rules on billion-tuple data within a reasonable time.
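
To make the abstract's ideas concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a simple uniform random sample and mines constant CFDs of the form (lhs = a → rhs = b), accepting a rule when its support and confidence on the sample pass thresholds, so a few dirty tuples do not block it. The function names, the thresholds, and the toy `zip`/`city` data are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def sample_rows(rows, k, seed=0):
    """Uniform random sample of at most k rows (assumed sampling scheme)."""
    rng = random.Random(seed)
    return rows if len(rows) <= k else rng.sample(rows, k)

def mine_constant_cfds(rows, lhs, rhs, min_support=2, min_confidence=0.8):
    """Return (lhs_value, rhs_value, confidence) for rules that tolerate noise.

    A rule lhs=a -> rhs=b is kept when at least min_support rows have lhs=a
    and the most frequent rhs value covers at least min_confidence of them.
    """
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1

    rules = []
    for lhs_value, counts in groups.items():
        total = sum(counts.values())
        rhs_value, hits = counts.most_common(1)[0]
        confidence = hits / total
        if total >= min_support and confidence >= min_confidence:
            rules.append((lhs_value, rhs_value, confidence))
    return sorted(rules)

# Hypothetical toy data: one misspelled city simulates a low-quality tuple.
rows = (
    [{"zip": "10001", "city": "New York"}] * 4
    + [{"zip": "10001", "city": "Nwe York"}]
    + [{"zip": "60601", "city": "Chicago"}] * 2
    + [{"zip": "60601", "city": "Boston"}] * 2
    + [{"zip": "94105", "city": "San Francisco"}]
)

rules = mine_constant_cfds(rows, "zip", "city")
for lhs_value, rhs_value, confidence in rules:
    print(f"zip={lhs_value} -> city={rhs_value} (confidence {confidence:.2f})")
```

In this sketch the rule for `10001` survives one dirty tuple, the rule for `60601` is rejected because its values conflict evenly, and `94105` is rejected for lack of support. On real data, `mine_constant_cfds` would be run on `sample_rows(rows, k)` rather than on the full table.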
