Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
134 tokens/sec
GPT-4o
10 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation (2305.04526v2)

Published 8 May 2023 in cs.CV

Abstract: Transfer learning has become a popular task adaptation method in the era of foundation models. However, many foundation models require large storage and computing resources, which makes off-the-shelf deployment impractical. Post-training compression techniques such as pruning and quantization can help lower deployment costs. Unfortunately, the resulting performance degradation limits the usability and benefits of such techniques. To close this performance gap, we propose CrAFT, a simple fine-tuning framework that enables effective post-training network compression. In CrAFT, users simply employ the default fine-tuning schedule along with sharpness minimization objective, simultaneously facilitating task adaptation and compression-friendliness. Contrary to the conventional sharpness minimization techniques, which are applied during pretraining, the CrAFT approach adds negligible training overhead as fine-tuning is done in under a couple of minutes or hours with a single GPU. The effectiveness of CrAFT, which is a general-purpose tool that can significantly boost one-shot pruning and post-training quantization, is demonstrated on both convolution-based and attention-based vision foundation models on a variety of target tasks. The code will be made publicly available.

Summary

We haven't generated a summary for this paper yet.