On the performance of various parallel GMRES implementations on CPU and GPU clusters

Published 10 Jun 2019 in cs.DC | (1906.04051v1)

Abstract: As the need for computational power and efficiency rises, parallel systems are becoming increasingly popular across scientific fields. While multicore-based architectures have been the center of attention for many years, the rapid development of general-purpose GPU-based architectures takes high-performance computing to the next level. In this work, different implementations of a parallel version of the preconditioned GMRES, an established iterative solver for large, sparse systems of linear equations, are presented, each on a different computing architecture: from distributed- and shared-memory core-based to GPU-based architectures. The computational experiments arise from the discretization of a benchmark boundary value problem with the finite element method. Major advantages and drawbacks of the various implementations are addressed in terms of parallel speedup, execution time, and memory issues. Among other findings, a comparison of the results across the different architectures shows the high potential of GPU-based architectures.
