Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
173 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Ranked Enumeration for Database Queries (2409.08142v2)

Published 12 Sep 2024 in cs.DB

Abstract: Ranked enumeration is a query-answering paradigm where the query answers are returned incrementally in order of importance (instead of returning all answers at once). Importance is defined by a ranking function that can be specific to the application, but typically involves either a lexicographic order (e.g., "ORDER BY R.A, S.B" in SQL) or a weighted sum of attributes (e.g., "ORDER BY 3*R.A + 2*S.B"). Recent work has introduced any-k algorithms for (multi-way) join queries, which push ranking into joins and avoid materializing intermediate results until necessary. The top-ranked answers are returned asymptotically faster than the common join-then-rank approach of database systems, resulting in orders-of-magnitude speedup in practice. In addition to their practical usefulness, these techniques complement a long line of theoretical research on unranked enumeration, where answers are also returned incrementally, but with no explicit ordering requirement. For a broad class of ranking functions with certain monotonicity properties, including lexicographic orders and sum-based rankings, the ordering requirement surprisingly does not increase the asymptotic time or space complexity, apart from logarithmic factors. A key insight is the connection between ranked enumeration for database queries and the fundamental task of computing the kth-shortest path in a graph. Although this connection is important for grounding the problem in the literature, it can obfuscate the simplicity of the algorithm. In this article, we adopt a pragmatic approach and present a slightly simplified version of the algorithm without the shortest-path interpretation. We believe that this will benefit practitioners looking to implement and optimize any-k approaches.

Summary

We haven't generated a summary for this paper yet.