Efficient Cost Modeling of Space-filling Curves (2312.16355v1)

Published 26 Dec 2023 in cs.DB

Abstract: A space-filling curve (SFC) maps points in a multi-dimensional space to one-dimensional points by discretizing the multi-dimensional space into cells and imposing a linear order on the cells. This way, an SFC enables the indexing of multi-dimensional data using a one-dimensional index such as a B+-tree. Choosing an appropriate SFC is crucial, as different SFCs have different effects on query performance. Currently, there are two primary strategies: 1) deterministic schemes, which are computationally efficient but often yield suboptimal query performance, and 2) dynamic schemes, which consider a broad range of candidate SFCs based on cost functions but incur significant computational overhead. Despite these strategies, existing methods cannot efficiently measure the effectiveness of SFCs under heavy query workloads and numerous SFC options. To address this problem, we propose means of constant-time cost estimations that can enhance existing SFC selection algorithms, enabling them to learn more effective SFCs. Additionally, we propose an SFC learning method that leverages reinforcement learning and our cost estimation to choose an SFC pattern efficiently. Experimental studies offer evidence of the effectiveness and efficiency of the proposed means of cost estimation and SFC learning.
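To make the mapping described in the abstract concrete, here is a minimal sketch of one well-known deterministic SFC, the Z-order (Morton) curve, in two dimensions. It is an illustration only, not the cost estimation or learning method proposed in the paper. The function names (`discretize`, `z_order_key`, `sfc_key`), the unit-square domain, and the 4-bit grid resolution are all assumptions chosen for the example.

```python
def discretize(value, lo, hi, bits):
    """Map a coordinate in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    idx = int((value - lo) / (hi - lo) * cells)
    # Clamp so that value == hi falls into the last cell instead of overflowing.
    return min(max(idx, 0), cells - 1)


def z_order_key(x_cell, y_cell, bits):
    """Interleave the bits of two cell indices: x in even positions, y in odd."""
    key = 0
    for i in range(bits):
        key |= ((x_cell >> i) & 1) << (2 * i)
        key |= ((y_cell >> i) & 1) << (2 * i + 1)
    return key


def sfc_key(point, bits=4, lo=0.0, hi=1.0):
    """One-dimensional key of a 2D point (hypothetical helper for this sketch)."""
    x, y = point
    return z_order_key(discretize(x, lo, hi, bits),
                       discretize(y, lo, hi, bits),
                       bits)


# Sorting by the key imposes the curve's linear order on the points; the
# sorted keys could then be stored in a one-dimensional index such as a B+-tree.
points = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5), (0.05, 0.05)]
ordered = sorted(points, key=sfc_key)
```

A different SFC corresponds to a different rule for combining the bits of the cell indices, which is why the choice of curve changes how many key ranges a given multi-dimensional query has to scan.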

