Evaluating Datalog over Semirings: A Grounding-based Approach

Published 19 Mar 2024 in cs.DB | (2403.12436v1)

Abstract: Datalog is a powerful yet elegant language that allows expressing recursive computation. Although Datalog evaluation has been extensively studied in the literature, so far, only loose upper bounds are known on how fast a Datalog program can be evaluated. In this work, we ask the following question: given a Datalog program over a naturally-ordered semiring $\sigma$, what is the tightest possible runtime? To this end, our main contribution is a general two-phase framework for analyzing the data complexity of Datalog over $\sigma$: first ground the program into an equivalent system of polynomial equations (i.e. grounding) and then find the least fixpoint of the grounding over $\sigma$. We present algorithms that use structure-aware query evaluation techniques to obtain the smallest possible groundings. Next, efficient algorithms for fixpoint evaluation are introduced over two classes of semirings: (1) finite-rank semirings and (2) absorptive semirings of total order. Combining both phases, we obtain state-of-the-art and new algorithmic results. Finally, we complement our results with a matching fine-grained lower bound.
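
The two-phase idea in the abstract can be illustrated with a small sketch. It assumes a linear transitive-closure program, `T(x,y) :- E(x,y)` and `T(x,y) :- T(x,z), E(z,y)`, evaluated over the tropical semiring (min, +), so the least fixpoint gives shortest-path distances. The names `Semiring`, `ground`, and `least_fixpoint` are hypothetical. Both the grounding and the iteration are deliberately naive and are not the structure-aware algorithms the paper proposes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Fact = Tuple[str, str]                # a ground IDB fact T(x, y)
Monomial = Tuple[Any, List[Fact]]     # (coefficient, IDB variables multiplied together)


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Tropical semiring (min, +): an assumed example of an absorptive, totally ordered semiring.
TROPICAL = Semiring(zero=float("inf"), one=0.0, add=min, mul=lambda a, b: a + b)


def ground(edges: Dict[Fact, Any]) -> Dict[Fact, List[Monomial]]:
    """Phase 1 (naive): turn the rules into one polynomial equation per ground fact."""
    nodes = sorted({u for u, _ in edges} | {v for _, v in edges})
    eqs: Dict[Fact, List[Monomial]] = {}
    for (x, y), w in edges.items():
        # T(x,y) :- E(x,y)
        eqs.setdefault((x, y), []).append((w, []))
    for (z, y), w in edges.items():
        for x in nodes:
            # T(x,y) :- T(x,z), E(z,y)
            eqs.setdefault((x, y), []).append((w, [(x, z)]))
    return eqs


def least_fixpoint(eqs: Dict[Fact, List[Monomial]], sr: Semiring) -> Dict[Fact, Any]:
    """Phase 2 (naive): iterate the equations from zero until no value changes."""
    vals = {head: sr.zero for head in eqs}
    changed = True
    while changed:
        changed = False
        for head, monomials in eqs.items():
            total = sr.zero
            for coeff, body in monomials:
                term = coeff
                for var in body:
                    # Variables without an equation are never derived, so they stay at zero.
                    term = sr.mul(term, vals.get(var, sr.zero))
                total = sr.add(total, term)
            if total != vals[head]:
                vals[head] = total
                changed = True
    return vals


edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 5.0}
dist = least_fixpoint(ground(edges), TROPICAL)
print(dist[("a", "c")])  # 3.0: the path a -> b -> c beats the direct edge of weight 5
```

This naive grounding produces one monomial per (node, edge) pair, and the iteration terminates here because the edge weights are non-negative. Swapping in a Boolean semiring (`or`, `and`) with every edge annotated `True` would compute plain reachability with the same two functions.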
