
Improved Parallel Cache-Oblivious Algorithms for Dynamic Programming and Linear Algebra (1809.09330v2)

Published 25 Sep 2018 in cs.DS

Abstract: Emerging non-volatile main memory (NVRAM) technologies provide byte-addressability, low idle power, and improved memory density, and are likely to be a key component of the future memory hierarchy. However, a critical challenge in achieving high performance is accounting for the asymmetry that NVRAM writes can be significantly more expensive than NVRAM reads. In this paper, we consider a large class of cache-oblivious algorithms for dynamic programming (DP) and linear algebra, and try to reduce the number of writes in the asymmetric setting while maintaining high parallelism. Our key approach is to show the correspondence between these problems and an abstraction for their computation, which we refer to as $k$-d grids. Then, by showing lower bounds and new algorithms for computing $k$-d grids, we obtain improved cache-oblivious algorithms for many DP recurrences and linear algebra problems in the asymmetric setting, both sequentially and in parallel. Surprisingly, even without considering the read-write asymmetry (i.e., setting the write cost equal to the read cost in the algorithms), the new algorithms improve the existing cache complexity of many problems. We believe the reason is that the extra level of abstraction of $k$-d grids helps us better understand the complexity and difficulties of these problems. We believe that the novelty of our framework is of interest and leads to many new questions for future work.

Citations (11)
