Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

An Efficient Compression of Deep Neural Network Checkpoints Based on Prediction and Context Modeling (2506.12000v1)

Published 13 Jun 2025 in cs.LG

Abstract: This paper is dedicated to an efficient compression of weights and optimizer states (called checkpoints) obtained at different stages during a neural network training process. First, we propose a prediction-based compression approach, where values from the previously saved checkpoint are used for context modeling in arithmetic coding. Second, in order to enhance the compression performance, we also propose to apply pruning and quantization of the checkpoint values. Experimental results show that our approach achieves substantial bit size reduction, while enabling near-lossless training recovery from restored checkpoints, preserving the model's performance and making it suitable for storage-limited environments.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Yuriy Kim (1 paper)
  2. Evgeny Belyaev (2 papers)

Summary

We haven't generated a summary for this paper yet.