Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
166 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Enabling predictable parallelism in single-GPU systems with persistent CUDA threads (2310.01212v1)

Published 2 Oct 2023 in cs.DC

Abstract: Graphics Processing Unit, or GPUs, have been successfully adopted both for graphic computation in 3D applications, and for general purpose application (GP-GPUs), thank to their tremendous performance-per-watt. Recently, there is a big interest in adopting them also within automotive and avionic industrial settings, imposing for the first time real-time constraints on the design of such devices. Unfortunately, it is extremely hard to extract timing guarantees from modern GPU designs, and current approaches rely on a model where the GPU is treated as a unique monolithic execution device. Unlike state-of-the-art of research, we try to open the box of modern GPU architectures, providing a clean way to exploit intra-GPU predictable execution.

Summary

We haven't generated a summary for this paper yet.