Papers
Topics
Authors
Recent
2000 character limit reached

Efficient Interleaved Batch Matrix Solvers for CUDA

Published 10 Sep 2019 in cs.DC and physics.comp-ph | (1909.04539v2)

Abstract: In this paper we present a new methodology for data accesses when solving batches of Tridiagonal and Pentadiagonal matrices that all share the same LHS matrix. By only storing one copy of this matrix there is a significant reduction in storage overheads and the authors show that there is also a performance increase in terms of compute time. These two results combined lead to an overall more efficient implementation over the current state of the art algorithms cuThomasBatch and cuPentBatch, allowing for a greater number of systems to be solved on a single GPU.

Citations (1)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.