Interference analysis of shared last-level cache on embedded GP-GPUs with multiple CUDA streams (2310.04848v1)

Published 7 Oct 2023 in cs.DC

Abstract: In modern heterogeneous architectures, efficient access to the data an application needs is a key factor in making compute tasks efficient in terms of power dissipation and execution time. New-generation SoCs are equipped with large last-level caches (LLCs) to make data access as efficient as possible. However, these systems introduce a new level of complexity in terms of the system's predictability, because concurrent tasks compete for the same resource and generate interference with one another. This paper provides a preliminary qualitative analysis of the degree of interference generated when several concurrent streams execute, for example one performing useful computing work and one generating interference. Specifically, we tested two important primitives, vadd and gemm, each subjected to interference from: i) a concurrent kernel that reads from shared memory, and ii) a concurrent stream that performs host-to-device memory copies.
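The setup described in the abstract lends itself to a short illustration. Below is a minimal sketch, not the authors' benchmark code, of how one CUDA stream can run a vadd kernel while a second stream issues host-to-device copies that contend for the shared LLC and memory subsystem; the vector size, stream names, and number of interfering copies are illustrative assumptions.

```cuda
// Sketch: a "compute" stream running vadd concurrently with an
// "interference" stream doing host-to-device copies. Illustrative only.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void vadd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];   // useful compute task
}

int main() {
    const int n = 1 << 24;                       // assumed vector length
    const size_t bytes = n * sizeof(float);

    // Pinned host memory so async copies can overlap with kernel execution.
    float *ha, *hb, *hc, *hbuf;
    cudaMallocHost((void**)&ha, bytes);
    cudaMallocHost((void**)&hb, bytes);
    cudaMallocHost((void**)&hc, bytes);
    cudaMallocHost((void**)&hbuf, bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc, *dbuf;
    cudaMalloc((void**)&da, bytes);
    cudaMalloc((void**)&db, bytes);
    cudaMalloc((void**)&dc, bytes);
    cudaMalloc((void**)&dbuf, bytes);

    cudaStream_t compute, interf;
    cudaStreamCreate(&compute);
    cudaStreamCreate(&interf);

    // Stage inputs for the compute stream.
    cudaMemcpyAsync(da, ha, bytes, cudaMemcpyHostToDevice, compute);
    cudaMemcpyAsync(db, hb, bytes, cudaMemcpyHostToDevice, compute);

    // Useful work: vadd on the compute stream.
    int threads = 256, blocks = (n + threads - 1) / threads;
    vadd<<<blocks, threads, 0, compute>>>(da, db, dc, n);

    // Interference: repeated host-to-device copies on the other stream,
    // issued so they can run concurrently with the kernel above.
    for (int r = 0; r < 8; ++r)
        cudaMemcpyAsync(dbuf, hbuf, bytes, cudaMemcpyHostToDevice, interf);

    cudaMemcpyAsync(hc, dc, bytes, cudaMemcpyDeviceToHost, compute);
    cudaStreamSynchronize(compute);
    cudaStreamSynchronize(interf);

    printf("hc[0] = %f\n", hc[0]);               // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc); cudaFree(dbuf);
    cudaFreeHost(ha); cudaFreeHost(hb); cudaFreeHost(hc); cudaFreeHost(hbuf);
    cudaStreamDestroy(compute); cudaStreamDestroy(interf);
    return 0;
}
```

In an actual measurement, the interference would be quantified by timing the vadd (or gemm) kernel, for example with cudaEvent timers, both with and without the interfering stream active and comparing the execution times.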
