Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads

Published 21 May 2021 in cs.DC, cs.SY, and eess.SY | (2105.10312v1)

Abstract: In order to satisfy timing constraints, modern real-time applications require massively parallel accelerators such as General-Purpose Graphics Processing Units (GPGPUs). Generation after generation, the number of computing clusters available in new GPU architectures steadily increases, so investigating suitable scheduling approaches is now mandatory. Such scheduling approaches concern mapping different, concurrent compute kernels onto the GPU computing clusters, and hence grouping GPU computing clusters into schedulable partitions. In this paper we propose novel techniques to define GPU partitions; this allows us to define suitable task-to-partition allocation mechanisms in which tasks are GPU compute kernels featuring different timing requirements. Such mechanisms take into account the interference that GPU kernels experience when running in overlapping time windows. Hence, an effective and simple way to quantify the magnitude of such interference is also presented. We demonstrate the efficiency of the proposed approaches against the classical techniques that consider the GPU as a single, non-partitionable resource.

Citations (7)

Authors (5)
