Disaggregated Accelerator Management System for Cloud Data Centers (2010.13594v1)
Abstract: A conventional data center that consists of monolithic-servers is confronted with limitations including lack of operational flexibility, low resource utilization, low maintainability, etc. Resource disaggregation is a promising solution to address the above issues. We propose a concept of disaggregated cloud data center architecture called Flow-in-Cloud (FiC) that enables an existing cluster computer system to expand an accelerator pool through a high-speed network. FlowOS-RM manages the entire pool resources, and deploys a user job on a dynamically constructed slice according to a user request. This slice consists of compute nodes and accelerators where each accelerator is attached to the corresponding compute node. This paper demonstrates the feasibility of FiC in a proof of concept experiment running a distributed deep learning application on the prototype system. The result successfully warrants the applicability of the proposed system.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.