2000 character limit reached
Model Compression
Published 20 May 2021 in cs.LG | (2105.10059v2)
Abstract: With time, machine learning models have increased in their scope, functionality and size. Consequently, the increased functionality and size of such models requires high-end hardware to both train and provide inference after the fact. This paper aims to explore the possibilities within the domain of model compression, discuss the efficiency of combining various levels of pruning and quantization, while proposing a quality measurement metric to objectively decide which combination is best in terms of minimizing the accuracy delta and maximizing the size reduction factor.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.