Simulating Performance of ML Systems with Offline Profiling (2002.06790v1)
Published 17 Feb 2020 in cs.DC and cs.LG
Abstract: We advocate that simulation based on offline profiling is a promising approach to better understand and improve complex ML systems. Our approach uses operation-level profiling and dataflow-based simulation, so that it offers a unified and automated solution for all frameworks and ML models, and it remains accurate by accounting for the various parallelization strategies in a real system.
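To make the idea concrete, the following is a minimal sketch of a dataflow-based simulation driven by per-operation timings. It is not the paper's implementation: the function `simulate`, the operation names, the durations, and the device labels are all hypothetical, and the scheduling policy (ops on the same device run serially, in the order they become ready) is a simplifying assumption.

```python
from collections import deque


def simulate(durations, deps, device_of):
    """Estimate finish times for a dataflow graph from profiled op durations.

    durations: op name -> profiled run time (hypothetical values, in ms)
    deps:      op name -> list of ops whose outputs it consumes
    device_of: op name -> device label; ops sharing a device run serially
    """
    indegree = {op: len(deps.get(op, [])) for op in durations}
    consumers = {op: [] for op in durations}
    for op, parents in deps.items():
        for parent in parents:
            consumers[parent].append(op)

    ready = deque(op for op in durations if indegree[op] == 0)
    device_free = {}  # device label -> time at which it becomes idle
    finish = {}       # op name -> simulated finish time

    while ready:
        op = ready.popleft()
        # An op starts once all its inputs exist and its device is idle.
        data_ready = max((finish[p] for p in deps.get(op, [])), default=0.0)
        start = max(data_ready, device_free.get(device_of[op], 0.0))
        finish[op] = start + durations[op]
        device_free[device_of[op]] = finish[op]
        for consumer in consumers[op]:
            indegree[consumer] -= 1
            if indegree[consumer] == 0:
                ready.append(consumer)

    return finish, max(finish.values())


# Hypothetical training step: two backward ops whose gradient all-reduces
# run on the network and can overlap with remaining GPU compute.
durations = {"fwd": 4.0, "bwd2": 3.0, "bwd1": 3.0,
             "ar2": 2.0, "ar1": 2.0, "update": 1.0}
deps = {"bwd2": ["fwd"], "bwd1": ["bwd2"], "ar2": ["bwd2"],
        "ar1": ["bwd1"], "update": ["ar1", "ar2"]}
device_of = {"fwd": "gpu", "bwd2": "gpu", "bwd1": "gpu",
             "ar2": "net", "ar1": "net", "update": "gpu"}

finish, makespan = simulate(durations, deps, device_of)
print(makespan)  # 13.0
```

In this toy graph the simulated step time is 13.0 ms, less than the 15.0 ms sum of the individual durations, because the all-reduce `ar2` overlaps with `bwd1`. Capturing this kind of overlap is what a dataflow-level simulation adds over simply summing profiled operation times.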