An Adaptive Performance-oriented Scheduler for Static and Dynamic Heterogeneity (1905.00673v2)

Published 2 May 2019 in cs.DC

Abstract: With the emergence of heterogeneous hardware paving the way for the post-Moore era, it is of high importance to adapt runtime scheduling to the platform's heterogeneity. To enhance adaptive and responsive scheduling, we introduce a Performance Trace Table (PTT) into XiTAO, a framework for elastic scheduling of mixed-mode parallelism. The PTT is an extensible, dynamic, and lightweight manifest of the per-core latency that can be used to guide the scheduling of both critical and non-critical tasks. By understanding the per-task latency, the PTT can infer task performance, intra-application interference, and inter-application interference. We run random Directed Acyclic Graphs (DAGs) of different workload categories as a benchmark on an NVIDIA Jetson TX2 chip, achieving up to 3.25x speedup over a standard work-stealing scheduler. To exemplify scheduling adaptation to interference, we run DAGs with high parallelism and analyze the scheduler's response to interference from a background process on an Intel Haswell (2650v3) multicore workstation. We also showcase XiTAO's scheduling performance by porting the VGG-16 image classification framework based on Convolutional Neural Networks (CNNs).
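
To make the idea of a per-core latency table concrete, here is a minimal sketch in Python. It is not XiTAO's implementation: the class name `PerformanceTraceTable`, the indexing by task type and core, the exponentially weighted update with factor `alpha`, and the "try unmeasured cores first" rule are all assumptions chosen for illustration. The abstract only states that the PTT records per-core latency and uses it to guide scheduling.

```python
class PerformanceTraceTable:
    """Hypothetical per-core latency table (illustrative, not XiTAO's API)."""

    def __init__(self, num_cores, num_task_types, alpha=0.2):
        # alpha is an assumed smoothing factor for the moving average.
        self.alpha = alpha
        # latency[task_type][core]; None marks a core not yet measured.
        self.latency = [[None] * num_cores for _ in range(num_task_types)]

    def record(self, task_type, core, observed):
        """Fold a newly observed latency into the table entry."""
        old = self.latency[task_type][core]
        if old is None:
            self.latency[task_type][core] = observed
        else:
            self.latency[task_type][core] = (
                (1 - self.alpha) * old + self.alpha * observed
            )

    def pick_core(self, task_type):
        """Return an unmeasured core if one exists, else the fastest core."""
        row = self.latency[task_type]
        for core, value in enumerate(row):
            if value is None:
                return core
        return min(range(len(row)), key=row.__getitem__)
```

Under these assumptions, a core slowed by a background process reports higher latencies, its table entry rises, and `pick_core` starts steering that task type to other cores, which mirrors the interference response the abstract describes.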
