Papers
Topics
Authors
Recent
2000 character limit reached

A Real-Time Digital Twin for Adaptive Scheduling (2512.18894v1)

Published 21 Dec 2025 in cs.DC

Abstract: High-performance computing (HPC) workloads are becoming increasingly diverse, exhibiting wide variability in job characteristics, yet cluster scheduling has long relied on static, heuristic-based policies. In this work we present SchedTwin, a real-time digital twin designed to adaptively guide scheduling decisions using predictive simulation. SchedTwin periodically ingests runtime events from the physical scheduler, performs rapid what-if evaluations of multiple policies using a high-fidelity discrete-event simulator, and dynamically selects the one satisfying the administrator configured optimization goal. We implement SchedTwin as an open-source software and integrate it with the production PBS scheduler. Preliminary results show that SchedTwin consistently outperforms widely used static scheduling policies, while maintaining low overhead (a few seconds per scheduling cycle). These results demonstrate that real-time digital twins offer a practical and effective path toward adaptive HPC scheduling.

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Paper to Video (Beta)

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.