Interference analysis of shared last-level cache on embedded GP-GPUs with multiple CUDA streams (2310.04848v1)

Published 7 Oct 2023 in cs.DC

Abstract: In modern heterogeneous architectures, efficient access to the data an application needs is a key factor in making compute tasks efficient in terms of power dissipation and execution time. New-generation SoCs are equipped with large last-level caches (LLCs) to make data access as efficient as possible. However, these systems introduce a new level of complexity in terms of the system's predictability, because concurrent tasks compete for the same resource and generate interference with one another. This paper provides a preliminary qualitative analysis of the degree of interference generated when several concurrent streams execute, for example one performing useful computing work and one generating interference. Specifically, we tested two important primitives, vadd and gemm, each subjected to interference from: i) a concurrent kernel that reads from shared memory, and ii) a concurrent stream that performs host-to-device memory copies.
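The setup described in the abstract lends itself to a short illustration. Below is a minimal sketch, not the authors' benchmark code, of how one CUDA stream can run a vadd kernel while a second stream issues host-to-device copies that contend for the shared LLC and memory subsystem; the vector size, stream names, and number of interfering copies are illustrative assumptions.

```cuda
// Sketch: a "compute" stream running vadd concurrently with an
// "interference" stream doing host-to-device copies. Illustrative only.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void vadd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];   // useful compute task
}

int main() {
    const int n = 1 << 24;                       // assumed vector length
    const size_t bytes = n * sizeof(float);

    // Pinned host memory so async copies can overlap with kernel execution.
    float *ha, *hb, *hc, *hbuf;
    cudaMallocHost((void**)&ha, bytes);
    cudaMallocHost((void**)&hb, bytes);
    cudaMallocHost((void**)&hc, bytes);
    cudaMallocHost((void**)&hbuf, bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc, *dbuf;
    cudaMalloc((void**)&da, bytes);
    cudaMalloc((void**)&db, bytes);
    cudaMalloc((void**)&dc, bytes);
    cudaMalloc((void**)&dbuf, bytes);

    cudaStream_t compute, interf;
    cudaStreamCreate(&compute);
    cudaStreamCreate(&interf);

    // Stage inputs for the compute stream.
    cudaMemcpyAsync(da, ha, bytes, cudaMemcpyHostToDevice, compute);
    cudaMemcpyAsync(db, hb, bytes, cudaMemcpyHostToDevice, compute);

    // Useful work: vadd on the compute stream.
    int threads = 256, blocks = (n + threads - 1) / threads;
    vadd<<<blocks, threads, 0, compute>>>(da, db, dc, n);

    // Interference: repeated host-to-device copies on the other stream,
    // issued so they can run concurrently with the kernel above.
    for (int r = 0; r < 8; ++r)
        cudaMemcpyAsync(dbuf, hbuf, bytes, cudaMemcpyHostToDevice, interf);

    cudaMemcpyAsync(hc, dc, bytes, cudaMemcpyDeviceToHost, compute);
    cudaStreamSynchronize(compute);
    cudaStreamSynchronize(interf);

    printf("hc[0] = %f\n", hc[0]);               // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc); cudaFree(dbuf);
    cudaFreeHost(ha); cudaFreeHost(hb); cudaFreeHost(hc); cudaFreeHost(hbuf);
    cudaStreamDestroy(compute); cudaStreamDestroy(interf);
    return 0;
}
```

In an actual measurement, the interference would be quantified by timing the vadd (or gemm) kernel, for example with cudaEvent timers, both with and without the interfering stream active and comparing the execution times.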
