An Improved Scheduling with Advantage Actor-Critic for Storm Workloads (2312.04126v1)
Abstract: Resources of various kinds are essential elements of data centers, and completion time is vital to users. Considering the persistence, periodicity, and spatial-temporal dependence of stream workloads, a new Storm scheduler based on Advantage Actor-Critic is proposed to improve resource utilization and thereby minimize completion time. A new weighted embedding with a Graph Neural Network is designed to capture the features of a job comprehensively, including the dependencies, types, and positions of the tasks in the job. An improved Advantage Actor-Critic that integrates task selection and executor assignment is proposed to schedule tasks to executors with better resource utilization. The status of tasks and executors is then updated for the next scheduling round. Experimental results show that the proposed Storm scheduler improves resource utilization compared to existing methods. The completion time is reduced by almost 17% on the TPC-H data set and by almost 25% on the Alibaba data set.
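To make the scheduling idea in the abstract concrete, the following is a minimal Python sketch of a generic one-step Advantage Actor-Critic update in which each action is a pair (task choice, executor assignment). It is not the paper's implementation: the function names, the factorized softmax policy, the zero terminal value, and the example numbers are all assumptions for illustration, and the Graph Neural Network embedding that would produce the scores is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_step_advantages(rewards, values, gamma=0.99):
    """A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    Assumption: the value after the final step is 0 (episode ends).
    """
    advantages = []
    for t, r in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        advantages.append(r + gamma * next_value - values[t])
    return advantages


def joint_log_prob(task_scores, task_idx, executor_scores, executor_idx):
    """Log-probability of picking a task and then an executor for it.

    Assumption: the two choices are modeled as independent softmax policies.
    """
    p_task = softmax(task_scores)[task_idx]
    p_exec = softmax(executor_scores)[executor_idx]
    return math.log(p_task) + math.log(p_exec)


def a2c_losses(log_probs, advantages):
    """Actor loss (policy gradient surrogate) and critic loss (mean squared advantage)."""
    n = len(advantages)
    actor = -sum(lp * a for lp, a in zip(log_probs, advantages)) / n
    critic = sum(a * a for a in advantages) / n
    return actor, critic


# Hypothetical two-step episode; rewards are negative elapsed time (an assumption).
rewards = [-1.0, -2.0]
values = [-2.5, -1.5]  # critic estimates V(s_0), V(s_1)
advantages = one_step_advantages(rewards, values, gamma=1.0)
log_probs = [
    joint_log_prob([0.2, 1.0, -0.3], 1, [0.0, 0.5], 1),
    joint_log_prob([0.7, 0.1], 0, [0.3, 0.3], 0),
]
actor_loss, critic_loss = a2c_losses(log_probs, advantages)
```

In a full scheduler, the task and executor scores would come from learned networks over the job embedding, and both losses would be minimized by gradient descent; here they are only evaluated for fixed numbers.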