Parallel and (Nearly) Work-Efficient Dynamic Programming (2404.16314v2)

Published 25 Apr 2024 in cs.DS and cs.DC

Abstract: The idea of dynamic programming (DP), proposed by Bellman in the 1950s, is one of the most important algorithmic techniques. However, in parallel, many fundamental and sequentially simple problems become more challenging, and open to a (nearly) work-efficient solution (i.e., the work is off by at most a polylogarithmic factor over the best sequential solution). In fact, sequential DP algorithms employ many advanced optimizations such as decision monotonicity or special data structures, and achieve better work than straightforward solutions. Many such optimizations are inherently sequential, which creates extra challenges for a parallel algorithm to achieve the same work bound. The goal of this paper is to achieve (nearly) work-efficient parallel DP algorithms by parallelizing classic, highly-optimized and practical sequential algorithms. We show a general framework called the Cordon Algorithm for parallel DP algorithms, and use it to solve several classic problems. Our selection of problems includes Longest Increasing Subsequence (LIS), sparse Longest Common Subsequence (LCS), convex/concave generalized Least Weight Subsequence (LWS), Optimal Alphabetic Tree (OAT), and more. We show how the Cordon Algorithm can be used to achieve the same level of optimization as the sequential algorithms, and achieve good parallelism. Many of our algorithms are conceptually simple, and we show some experimental results as proofs-of-concept.
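
To make the abstract's point about sequential optimizations concrete, the sketch below contrasts a straightforward O(n^2) DP for Longest Increasing Subsequence with the textbook O(n log n) version that uses binary search over a sorted "tails" array. This is not the Cordon Algorithm and is not taken from the paper; the function names `lis_naive` and `lis_fast` are illustrative. It only shows the kind of optimized sequential baseline whose work bound a parallel algorithm would need to match.

```python
from bisect import bisect_left


def lis_naive(seq):
    """Straightforward DP: O(n^2) work.

    dp[i] is the length of the longest strictly increasing
    subsequence that ends at position i.
    """
    dp = []
    for i, x in enumerate(seq):
        best = 1
        for j in range(i):
            if seq[j] < x:
                best = max(best, dp[j] + 1)
        dp.append(best)
    return max(dp, default=0)


def lis_fast(seq):
    """Optimized sequential DP: O(n log n) work.

    tails[k] holds the smallest possible last element of a strictly
    increasing subsequence of length k + 1 seen so far. The array
    stays sorted, so each element is placed by binary search.
    """
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)   # x extends the longest subsequence so far
        else:
            tails[k] = x      # x is a smaller tail for length k + 1
    return len(tails)
```

Each step of `lis_fast` reads and updates state left by all earlier steps, which is the sort of sequential dependence the abstract describes as an obstacle to parallelization at the same work bound.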
