On Efficient Constructions of Checkpoints (2009.13003v1)

Published 28 Sep 2020 in cs.LG and stat.ML

Abstract: Efficient construction of checkpoints/snapshots is a critical tool for training and diagnosing deep learning models. In this paper, we propose a lossy compression scheme for checkpoint construction (called LC-Checkpoint). LC-Checkpoint simultaneously maximizes the compression rate and optimizes the recovery speed, under the assumption that SGD is used to train the model. LC-Checkpoint uses quantization and priority promotion to store the information most crucial for SGD to recover, and then uses Huffman coding to exploit the non-uniform distribution of the gradient scales. Our extensive experiments show that LC-Checkpoint achieves a compression rate of up to $28\times$ and a recovery speedup of up to $5.77\times$ over a state-of-the-art algorithm (SCAR).
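
The abstract's last step, Huffman coding over a non-uniform distribution of scales, can be illustrated with a small sketch. This is not the paper's LC-Checkpoint implementation: the `bucket` function (sign plus binary exponent), the toy `deltas` list, and all other names are assumptions chosen only to show why a skewed bucket distribution compresses well.

```python
import heapq
import math
from collections import Counter


def bucket(x):
    """Map a float to a hypothetical (sign, exponent) bucket; zero gets its own bucket."""
    if x == 0.0:
        return (0, 0)
    _, exponent = math.frexp(abs(x))
    return (1 if x > 0 else -1, exponent)


def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from a {symbol: count} table."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap entries are (weight, unique tiebreak, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Toy parameter deltas (made up): most entries are zero, a few are small or large.
deltas = [0.0] * 12 + [0.003, 0.004, -0.003, 0.0035, 0.5, -0.25]
symbols = [bucket(d) for d in deltas]
freqs = Counter(symbols)
codes = huffman_code(freqs)

# Compare the variable-length encoding with a fixed-width code per bucket.
huffman_bits = sum(len(codes[s]) for s in symbols)
fixed_bits = len(symbols) * max(1, math.ceil(math.log2(len(freqs))))
print(huffman_bits, fixed_bits)
```

Because the zero bucket dominates in this toy input, it receives the shortest code, and the total bit count falls below that of a fixed-width encoding of the same buckets.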

Authors (4)

Yu Chen
Zhenming Liu
Bin Ren
Xin Jin

Citations (10)

