Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
156 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates (2311.17293v1)

Published 29 Nov 2023 in cs.DB

Abstract: Most query optimizers rely on cardinality estimates to determine optimal execution plans. While traditional databases such as PostgreSQL, Oracle, and Db2 utilize many types of synopses -- including histograms, samples, and sketches -- recent main-memory databases like DuckDB and Heavy.AI often operate with minimal or no estimates, yet their performance does not necessarily suffer. To the best of our knowledge, no analytical comparison has been conducted between optimizers with and without cardinality estimates to understand their performance characteristics in different settings, such as indexed, non-indexed, and multi-threaded. In this paper, we present a comparative analysis between optimizers that use cardinality estimates and those that do not. We use the Join Order Benchmark (JOB) for our evaluation and true cardinalities as the baseline. Our investigation reveals that cardinality estimates have marginal impact in non-indexed settings. Meanwhile, when indexes are available, inaccurate estimates may lead to sub-optimal physical operators -- even with an optimal join order. Furthermore, the impact of cardinality estimates is less significant in highly-parallel main-memory databases.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (48)
  1. Apache Calcite. https://calcite.apache.org.
  2. DuckDB. https://duckdb.org/.
  3. Heavy.AI. https://www.heavy.ai/.
  4. MonetDB. www.monetdb.org.
  5. Oracle query optimization. https://www.oreilly.com/library/view/oracle-essentials-oracle9i/0596001797/ch04s07.html.
  6. Presto. prestodb.io/.
  7. PostgreSQL. www.postgresql.org, 2023.
  8. Peter Boncz. The IMDB Dataset. http://homepages.cwi.nl/~boncz/job/imdb.tgz.
  9. Tpc-h analyzed: Hidden messages and lessons learned from an influential benchmark. pages 61–76, 01 2014.
  10. Walter Cai. Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities. https://github.com/waltercai/pqo-opensource.
  11. Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities. In SIGMOD 2019, pages 18–35.
  12. Praire: A rule specification framework for query optimizers. In Proceedings of the Eleventh International Conference on Data Engineering, ICDE ’95, page 201–210, USA, 1995. IEEE Computer Society.
  13. Asoke Datta. Analyzing the Impact of Cardinality Estimates on Query Optimization. https://github.com/Asoke26/Estimates-and-No-Estimates/.
  14. Simpli-squared: A very simple yet unexpectedly powerful join ordering algorithm without cardinality estimates. CoRR, abs/2111.00163, 2021.
  15. Database Systems: The Complete Book. Prentice Hall, 2008.
  16. The exodus optimizer generator. In Proceedings of the 1987 ACM SIGMOD International Conference on Management of Data, SIGMOD ’87, page 160–172, New York, NY, USA, 1987. Association for Computing Machinery.
  17. The volcano optimizer generator: Extensibility and efficient search. In Proceedings of the Ninth International Conference on Data Engineering, page 209–218, USA, 1993. IEEE Computer Society.
  18. Yuxing Han. Cardinality estimation in DBMS: a comprehensive benchmark evaluation. https://github.com/Nathaniel-Han/End-to-End-CardEst-Benchmark.
  19. Cardinality estimation in dbms: A comprehensive benchmark evaluation. Proc. VLDB Endow., 15(4):752–765, dec 2021.
  20. Cardinality estimation in dbms: A comprehensive benchmark evaluation, 2021.
  21. Ingres: A relational data base system. AFIPS ’75, New York, NY, USA, 1975. Association for Computing Machinery.
  22. Simplicity Done Right for Join Ordering. In CIDR 2021.
  23. Turbo-charging spj query plans with learned physical join operator selections. Proc. VLDB Endow., page 2706–2718, jul 2022.
  24. On the optimal nesting order for computing n-relational joins. ACM Transactions on Database Systems, 9(3):482–502, 1984.
  25. Yannis E. Ioannidis. Query Optimization. ACM Comput. Surv., 28(1):121–123, 1996.
  26. Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization. In Proceedings of the 1991 ACM SIGMOD International Conference on Management of Data, SIGMOD ’91, page 168–177, New York, NY, USA, 1991. Association for Computing Machinery.
  27. COMPASS: Online Sketch-based Query Optimization for In-Memory Databases. In SIGMOD 2021, pages 804–816.
  28. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR 2019.
  29. Learning to Optimize Join Queries With Deep Reinforcement Learning. CoRR, arXiv:1808.03196v2, 2018.
  30. How Good Are Query Optimizers, Really? PVLDB, 9(3):204–215, 2015.
  31. Query Optimization Through the Looking Glass, and What We Found Running the Join Order Benchmark. VLDB Journal, 27:643–668, 2018.
  32. Cardinality Estimation Using Neural Networks. In CASCON 2015, pages 53–59.
  33. A Black-Box Approach to Query Cardinality Estimation. In CIDR 2007.
  34. Neo: A Learned Query Optimizer. PVLDB, 12(11):1705–1718, 2019.
  35. Deep Reinforcement Learning for Join Order Enumeration. In aiDM 2018.
  36. Counter strike: Generic top-down join enumeration for hypergraphs. PVLDB, 6(14):1822–1833, 2013.
  37. Dynamic programming strikes back. In SIGMOD, pages 539–552, 2008.
  38. The Star Schema Benchmark and Augmented Fact Table Indexing, page 237–252. Springer-Verlag, Berlin, Heidelberg, 2009.
  39. An Empirical Analysis of Deep Learning for Cardinality Estimation. CoRR, arXiv:1905.06425v2, 2019.
  40. Extensible/rule based query rewrite optimization in starburst. SIGMOD Rec., 21(2):39–48, June 1992.
  41. Why you should run tpc-ds: A workload analysis. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB ’07, page 1138–1149. VLDB Endowment, 2007.
  42. Greg Rahn. Join Order Benchmark (JOB). https://github.com/gregrahn/join-order-benchmark.
  43. Access Path Selection in a Relational Database Management System. In SIGMOD 1979, pages 23–34.
  44. Heuristic and Randomized Optimization for the Join Ordering Problem. VLDB Journal, 6(3):191–208, 1997.
  45. Join order selection - good enough is easy. In British National Conference on Databases, 2000.
  46. Are we ready for learned cardinality estimation?, 2021.
  47. Rule-based query optimization, revisited. CIKM ’99, New York, NY, USA, 1999. Association for Computing Machinery.
  48. Cardinality Estimation with Local Deep Learning Models. In aiDM 2019, pages 1–8.
Citations (1)

Summary

We haven't generated a summary for this paper yet.