Portability of Fortran do concurrent GPU offload beyond NVIDIA
Determine how well the performance and correctness results achieved by offloading Fortran’s do concurrent (DC) loops to NVIDIA GPUs carry over to GPUs from other vendors, specifically Intel and AMD, when built with each vendor's own compiler and toolchain.
References
While there have been previous promising results on the NVIDIA platform (see the next section), how well they extend to other vendors is an open question.
— Portability of Fortran's `do concurrent' on GPUs
(2408.07843 - Caplan et al., 2024) in Section 1 Introduction