Fast and Simple Sorting Using Partial Information (2404.04552v3)
Abstract: We consider the problem of sorting $n$ items, given the outcomes of $m$ pre-existing comparisons. We present a simple and natural deterministic algorithm that runs in $O(m+\log T)$ time and does $O(\log T)$ comparisons, where $T$ is the number of total orders consistent with the pre-existing comparisons. Our running time and comparison bounds are best possible up to constant factors, thus resolving a problem that has been studied intensely since 1976 (Fredman, Theoretical Computer Science). The best previous algorithm with a bound of $O(\log T)$ on the number of comparisons has a time bound of $O(n^{2.5})$ and is more complicated. Our algorithm combines three classic algorithms: topological sort, heapsort with the right kind of heap, and efficient search in a sorted list. It outputs the items in sorted order one by one. It can be modified to stop early, thereby solving the important and more general top-$k$ sorting problem: Given $k$ and the outcomes of some pre-existing comparisons, output the smallest $k$ items in sorted order. The modified algorithm solves the top-$k$ sorting problem in minimum time and comparisons, to within constant factors.
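As a rough illustration of the topological-sort-plus-heap combination described in the abstract, the Python sketch below keeps the items with no unprocessed predecessors in a heap and repeatedly extracts the minimum. It is a simplification under stated assumptions, not the paper's algorithm: it uses the ordinary binary heap from `heapq` rather than the kind of heap the authors require, and it omits the sorted-list search step, so it should not be expected to meet the $O(\log T)$ comparison bound in general. The names `topological_heapsort`, `top_k`, `known`, and `less` are hypothetical, and items are assumed to be hashable and distinct.

```python
import heapq
from itertools import islice


def topological_heapsort(items, known, less):
    """Yield items in sorted order, one by one.

    items: hashable, distinct items (assumption of this sketch).
    known: pairs (u, v) meaning "u < v" is a pre-existing comparison.
    less:  hypothetical oracle; less(a, b) performs one new comparison.
    """
    successors = {x: [] for x in items}
    indegree = {x: 0 for x in items}
    for u, v in known:
        successors[u].append(v)
        indegree[v] += 1

    class Keyed:
        """Wrapper so that heapq orders entries through the oracle."""
        __slots__ = ("item",)

        def __init__(self, item):
            self.item = item

        def __lt__(self, other):
            return less(self.item, other.item)

    # The heap holds the current sources: items whose known predecessors
    # have all been output. The overall minimum is always among them.
    heap = [Keyed(x) for x in items if indegree[x] == 0]
    heapq.heapify(heap)
    while heap:
        x = heapq.heappop(heap).item
        yield x
        for y in successors[x]:
            indegree[y] -= 1
            if indegree[y] == 0:
                heapq.heappush(heap, Keyed(y))


def top_k(items, known, less, k):
    """Return the k smallest items in sorted order by stopping early."""
    return list(islice(topological_heapsort(items, known, less), k))
```

Because the function is a generator, stopping after $k$ outputs gives a top-$k$ variant without sorting the remaining items. When the pre-existing comparisons already form a chain over all items, so that $T = 1$, the heap never holds more than one item and the sketch makes no new comparisons.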
- “Using TPA to count linear extensions” In arXiv preprint arXiv:1010.4981, 2010
- Prosenjit Bose, John Howat and Pat Morin “A history of distribution-sensitive data structures” In Space-Efficient Data Structures, Streams, and Algorithms: Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday Springer, 2013, pp. 133–149
- Graham Brightwell “Balanced pairs in partial orders” In Discrete Mathematics 201.1-3 Elsevier, 1999, pp. 25–52
- “Counting linear extensions is #P-complete” In Proceedings of the twenty-third annual ACM symposium on Theory of computing, 1991, pp. 175–181
- Graham R Brightwell, Stefan Felsner and William T Trotter “Balancing pairs and the cross product conjecture” In Order 12 Springer, 1995, pp. 327–349
- “Faster random generation of linear extensions” In Discrete mathematics 201.1-3 Elsevier, 1999, pp. 81–88
- “On generalized comparison-based sorting problems” In Space-Efficient Data Structures, Streams, and Algorithms: Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday Springer, 2013, pp. 164–175
- “Sorting under partial information (without the ellipsoid algorithm)” In Proceedings of the forty-second ACM symposium on Theory of computing, 2010, pp. 359–368
- “An efficient algorithm for partial order production” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009 ACM, 2009, pp. 93–100
- “Top-k sorting under partial order information” In Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1007–1019
- Martin Dyer, Alan Frieze and Ravi Kannan “A random polynomial-time algorithm for approximating the volume of convex bodies” In Journal of the ACM (JACM) 38.1 ACM New York, NY, USA, 1991, pp. 1–17
- Amr Elmasry “A priority queue with the working-set property” In International Journal of Foundations of Computer Science 17.06 World Scientific, 2006, pp. 1455–1465
- Amr Elmasry, Arash Farzan and John Iacono “On the hierarchy of distribution-sensitive properties for data structures” In Acta informatica 50.4 Springer, 2013, pp. 289–295
- Robert W Floyd “Algorithm 245: treesort” In Communications of the ACM 7.12 ACM New York, NY, USA, 1964, p. 701
- Michael L Fredman “How good is the information theory bound in sorting?” In Theoretical Computer Science 1.4 Elsevier, 1976, pp. 355–361
- “The Pairing Heap: A New Form of Self-Adjusting Heap” In Algorithmica 1.1, 1986, pp. 111–129 DOI: 10.1007/BF01840439
- “Heaps with the Working-Set Bound” Preprint, 2024
- “Universal Optimality of Dijkstra via Beyond-Worst-Case Heaps”, 2023 arXiv:2311.11793 [cs.DS]
- Mark Huber “Fast perfect sampling from linear extensions” In Discrete Mathematics 306.4 Elsevier, 2006, pp. 420–428
- John Iacono “Improved upper bounds for pairing heaps” In Scandinavian Workshop on Algorithm Theory, 2000, pp. 32–45 Springer
- Arthur B Kahn “Topological sorting of large networks” In Communications of the ACM 5.11 ACM New York, NY, USA, 1962, pp. 558–562
- Jeff Kahn and Jeong Han Kim “Entropy and sorting” In Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, 1992, pp. 178–187
- “Balancing extensions via Brunn-Minkowski” In Combinatorica 11.4, 1991, pp. 363–368
- “Balancing poset extensions” In Order 1 Springer, 1984, pp. 113–126
- “On the conductance of order Markov chains” In Order 8 Springer, 1991, pp. 7–15
- Donald E Knuth “The Art of Computer Programming: Fundamental Algorithms, volume 1” Addison-Wesley Professional, 1997
- “Smooth heaps and a dual view of self-adjusting data structures” In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018, pp. 801–814
- “Adaptive heapsort” In Journal of Algorithms 14.3 Elsevier, 1993, pp. 395–413
- Nathan Linial “The information-theoretic bound is good for merging” In SIAM Journal on Computing 13.4 SIAM, 1984, pp. 795–801
- Peter Matthews “Generating a random linear extension of a partial order” In The Annals of Probability 19.3 Institute of Mathematical Statistics, 1991, pp. 1367–1392
- “Dynamic Optimality Refuted–For Tournament Heaps” In arXiv preprint arXiv:1908.00563, 2019
- A. Schönhage “The Production of Partial Orders” In Astérisque 38-39 Soc. Math. France, Paris, 1976, pp. 229–246
- Claude Elwood Shannon “A mathematical theory of communication” In The Bell system technical journal 27.3 Nokia Bell Labs, 1948, pp. 379–423
- Daniel Dominic Sleator and Robert Endre Tarjan “Self-adjusting binary search trees” In Journal of the ACM (JACM) 32.3 ACM New York, NY, USA, 1985, pp. 652–686
- J Williams “Heapsort” In Commun. ACM 7.6, 1964, pp. 347–348
- Andrew Chi-Chih Yao “On the Complexity of Partial Order Productions” In SIAM J. Comput. 18.4, 1989, pp. 679–689