DaphneSched: A Scheduler for Integrated Data Analysis Pipelines

Published 3 Aug 2023 in cs.DC | (2308.01607v1)

Abstract: DAPHNE is a new open-source software infrastructure designed to address the increasing demands of integrated data analysis (IDA) pipelines, which combine data management (DM), high performance computing (HPC), and machine learning (ML) systems. Efficiently executing IDA pipelines is challenging due to their diverse computing characteristics and demands. IDA pipelines executed with the DAPHNE infrastructure therefore require an efficient and versatile scheduler. This work introduces DaphneSched, the task-based scheduler at the core of DAPHNE. DaphneSched is versatile: it incorporates eleven task partitioning and three task assignment techniques, bringing state-of-the-art task scheduling closer to the state of the practice. To showcase DaphneSched's effectiveness in scheduling IDA pipelines, we evaluate its performance on two applications: a product recommendation system and the training of a linear regression model. We conduct performance experiments on multicore platforms with 20 and 56 cores. The results show that the versatility of DaphneSched enables combinations of scheduling strategies that outperform commonly used scheduling techniques by up to 13%. This work confirms the benefits of employing DaphneSched for the efficient execution of applications with IDA pipelines.
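
The abstract does not name the task partitioning techniques it refers to. As a rough illustration of what "task partitioning" means in a task-based scheduler, the sketch below contrasts two well-known schemes from the loop-scheduling literature: static chunking and guided self-scheduling. The choice of these two schemes, the function names, and the parameters are assumptions made for illustration only; this is not DaphneSched's code or API.

```python
import math


def static_chunks(n_tasks: int, n_workers: int) -> list[int]:
    """Static partitioning: one near-equal chunk per worker, fixed up front."""
    base, extra = divmod(n_tasks, n_workers)
    sizes = [base + (1 if i < extra else 0) for i in range(n_workers)]
    return [s for s in sizes if s > 0]  # drop empty chunks when n_tasks < n_workers


def guided_chunks(n_tasks: int, n_workers: int, min_chunk: int = 1) -> list[int]:
    """Guided self-scheduling: each chunk is ceil(remaining / n_workers).

    Chunks start large (low scheduling overhead) and shrink toward the end
    (finer granularity to even out load imbalance).
    """
    chunks = []
    remaining = n_tasks
    while remaining > 0:
        size = max(min_chunk, math.ceil(remaining / n_workers))
        size = min(size, remaining)  # never hand out more than is left
        chunks.append(size)
        remaining -= size
    return chunks


if __name__ == "__main__":
    print("static:", static_chunks(100, 4))
    print("guided:", guided_chunks(100, 4))
```

Task assignment is a separate decision: once chunks exist, the scheduler must still decide which worker obtains which chunk, for example from one shared queue or from per-worker queues. The sketch covers only the partitioning side.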
