Parallel Integer Sort: Theory and Practice (2401.00710v1)

Published 1 Jan 2024 in cs.DS and cs.DC

Abstract: Integer sorting is a fundamental problem in computer science. This paper studies parallel integer sort both in theory and in practice. In theory, we show tighter bounds for a class of existing practical integer sort algorithms, which provides a solid theoretical foundation for their widespread usage in practice and strong performance. In practice, we design a new integer sorting algorithm, DovetailSort, that is theoretically-efficient and has good practical performance. In particular, DovetailSort overcomes a common challenge in existing parallel integer sorting algorithms, which is the difficulty of detecting and taking advantage of duplicate keys. The key insight in DovetailSort is to combine algorithmic ideas from both integer- and comparison-sorting algorithms. In our experiments, DovetailSort achieves competitive or better performance than existing state-of-the-art parallel integer and comparison sorting algorithms on various synthetic and real-world datasets.
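
The abstract's idea of mixing integer-sorting and comparison-sorting techniques, and of exploiting duplicate keys, can be illustrated with a small sequential sketch. This is not DovetailSort and not the authors' code: it is a plain most-significant-digit radix sort that switches to a comparison sort on small buckets and returns early when a bucket holds a single repeated key. The function name `msd_radix_sort` and the parameters `key_bits`, `digit_bits`, and `base_case` are hypothetical choices made only for this illustration, and the keys are assumed to be non-negative integers below `2**key_bits`.

```python
from typing import List


def msd_radix_sort(keys: List[int], key_bits: int = 32,
                   digit_bits: int = 8, base_case: int = 16) -> List[int]:
    """Illustrative MSD radix sort (assumes 0 <= key < 2**key_bits).

    Not the paper's algorithm: a sequential sketch that combines
    radix-style bucketing with a comparison-sort base case and a
    shortcut for buckets made of one repeated key.
    """

    def sort_range(items: List[int], bits_left: int) -> List[int]:
        # Comparison-sorting idea: small inputs go to a comparison sort.
        if len(items) <= base_case:
            return sorted(items)
        # Duplicate shortcut: a bucket holding one distinct key is sorted.
        if min(items) == max(items):
            return items
        # All bits consumed means every key in this bucket is equal.
        if bits_left <= 0:
            return items
        # Integer-sorting idea: distribute by the next most significant digit.
        width = min(digit_bits, bits_left)
        shift = bits_left - width
        mask = (1 << width) - 1
        buckets: List[List[int]] = [[] for _ in range(1 << width)]
        for x in items:
            buckets[(x >> shift) & mask].append(x)
        # Recurse on each non-empty bucket using the remaining lower bits.
        out: List[int] = []
        for bucket in buckets:
            if bucket:
                out.extend(sort_range(bucket, shift))
        return out

    return sort_range(list(keys), key_bits)
```

In this sketch a run of identical keys is recognized with one linear scan instead of being pushed through every remaining digit level, which is the kind of saving the abstract attributes to detecting duplicates. A parallel implementation would process buckets concurrently, which this sequential sketch does not attempt.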
