Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
158 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Spanning Tree-based Query Plan Enumeration (2403.04026v1)

Published 6 Mar 2024 in cs.DB

Abstract: In this work, we define the problem of finding an optimal query plan as finding spanning trees with low costs. This approach empowers the utilization of a series of spanning tree algorithms, thereby enabling systematic exploration of the plan search space over a join graph. Capitalizing on the polynomial time complexity of spanning tree algorithms, we present the Ensemble Spanning Tree Enumeration (ESTE) strategy. ESTE employs two conventional spanning tree algorithms, Prim's and Kruskal's, together to enhance the robustness of the query optimizer. In ESTE, multiple query plans are enumerated exploring different areas of the search space. This positions ESTE as an intermediate strategy between exhaustive and heuristic enumeration strategies. We show that ESTE is more robust in identifying efficient query plans for large queries. In the case of data modifications and workload demand increase, we believe that our approach can be a cheaper alternative to maintain optimizer robustness by integrating additional spanning tree algorithms rather than completely changing the optimizer to another plan enumeration algorithm. The experimental evaluation shows that ESTE achieves better consistency in plan quality and optimization time than existing solutions while identifying similarly optimal plans.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (51)
  1. Jörg Arndt. Matters Computational: Ideas, Algorithms, Source Code. Springer Science and Business Media, 2010.
  2. Peter Boncz. The IMDB Dataset. http://homepages.cwi.nl/~boncz/job/imdb.tgz.
  3. Polynomial Heuristics for Query Optimization. In ICDE, pages 589–600, 2010.
  4. Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities. In SIGMOD, pages 18–35, 2019.
  5. Introduction to Algorithms. MIT press, 2022.
  6. Optimal Top-Down Join Enumeration. In SIGMOD, pages 785–796, 2007.
  7. Leonidas Fegaras. A New Heuristic for Optimizing Large Queries. In DEXA, pages 726–735, 1998.
  8. A New, Highly Efficient, and Easy to Implement Top-Down Join Enumeration Algorithm. In ICDE, pages 864–875, 2011.
  9. Efficiently Computing Join Orders with Heuristic Search. SIGMOD, 1(1), 2023.
  10. On the optimal nesting order for computing n-relational joins. ACM Transactions on Database Systems, 9(3):482–502, 1984.
  11. On the Optimal Nesting Order for Computing N-relational Joins. TODS, 9(3):482–502, 1984.
  12. On the Propagation of Errors in the Size of Join Results. SIGMOD Record, 20(2):268–277, 1991.
  13. Randomized Algorithms for Optimizing Large Join Queries. SIGMOD, 19(2):312–321, 1990.
  14. Query Optimization by Simulated Annealing. In SIGMOD, pages 9–22, 1987.
  15. Yesdaulet Izenov. Spanning Tree-based Query Plan Enumeration. https://github.com/yizenov/este, 2024.
  16. COMPASS: Online Sketch-based Query Optimization for In-memory Databases. In SIGMOD, pages 106–117, 2021.
  17. Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models. PVLDB, 10(13):2085–2096, 2017.
  18. Learned Cardinalities:Estimating Correlated Joins with Deep Learning. In CIDR, 2019.
  19. Estimating Cardinalities with Deep Sketches. CoRR, arXiv:1904.08223v1, 2019.
  20. Iterative Dynamic Programming: a New Class of Query Optimization Algorithms. TODS, 25(1):43–82, 2000.
  21. Optimization of Nonrecursive Queries. In VLDB, volume 86, pages 128–137, 1986.
  22. Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration. Data Science and Engineering, 6:86–101, 2021.
  23. Optimizing large join queries using a graph-based approach. IEEE Transactions on Knowledge and Data Engineering, 13(2):298–315, 2001.
  24. How Good Are Query Optimizers, Really? PVLDB, 9(3):204–215, 2015.
  25. Cardinality Estimation Done Right: Index-Based Join Sampling. In CIDR, 2017.
  26. Query Optimization Through the Looking Glass, and What We Found Running the Join Order Benchmark. VLDBJ, 27:643–668, 2018.
  27. Cardinality Estimation Using Neural Networks. In CASCON, pages 53–59, 2015.
  28. Guy Lohman. Is Query Optimization a Solved Problem? https://wp.sigmod.org/?p=1075, 2014.
  29. A Black-Box Approach to Query Cardinality Estimation. In CIDR, 2007.
  30. Guido Moerkotte. Building Query Compilers. https://pi3.informatik.unimannheim.de/~moer/querycompiler.pdf.
  31. Counter Strike: Generic Top-Down Join Enumeration for Hypergraphs. PVLDB, 6(14):1822–1833, 2013.
  32. Analysis of Two Existing and One New Dynamic Programming Algorithm for the Generation of Optimal Bushy Join Trees Without Cross Products. In PVLDB, pages 930–941, 2006.
  33. Dynamic Programming Strikes Back. In SIGMOD, pages 539–552, 2008.
  34. Improved Selectivity Estimation by Combining Knowledge from Sampling and Synopses. PVLDB, 9(11):1016–1028, 2018.
  35. Flow-Loss: Learning Cardinality Estimates That Matter. PVLDB, 14(11):2019–2032, 2021.
  36. Thomas Neumann. Query Simplification: Graceful Degradation for Join-Order Optimization. In SIGMOD, pages 403–414, 2009.
  37. Adaptive Optimization of Very Large Join Queries. In SIGMOD, pages 677–692, 2018.
  38. Worst-Case Optimal Join Algorithms: [Extended Abstract]. In PODS, pages 37–48, 2012.
  39. Postgresql. www.postgresql.org.
  40. An Empirical Analysis of Deep Learning for Cardinality Estimation. CoRR, arXiv:1905.06425v2, 2019.
  41. How I Learned to Stop Worrying and Love Re-optimization. In ICDE, pages 1758–1761, 2019.
  42. Selectivity Estimation Without the Attribute Value Independence Assumption. In PVLDB, pages 486–495, 1997.
  43. Greg Rahn. Join Order Benchmark (JOB). https://github.com/gregrahn/join-order-benchmark.
  44. Access Path Selection in a Relational Database Management System. In SIGMOD, pages 23–34, 1979.
  45. Heuristic and Randomized Optimization for the Join Ordering Problem. VLDBJ, 6(3):191–208, 1997.
  46. Arun Swami. Optimization of Large Join Queries: Combining Heuristics and Combinatorial Techniques. In SIGMOD, pages 367–376, 1989.
  47. Optimization of Large Join Queries. In SIGMOD, pages 8–17, 1988.
  48. Big Data Analytics Group Saarland University. mutable: A Database System for Research and Fast Prototyping. https://mutable.uni-saarland.de.
  49. Rapid Bushy Join-order Optimization with Cartesian Products. SIGMOD Record, 25(2):35–46, 1996.
  50. Robustness Metrics for Relational Query Execution Plans. PVLDB, 11(11):1360–1372, 2018.
  51. Cardinality Estimation with Local Deep Learning Models. In aiDM, pages 1–8, 2019.

Summary

We haven't generated a summary for this paper yet.