The Knowledge Within: Methods for Data-Free Model Compression (1912.01274v2)

Published 3 Dec 2019 in cs.LG, cs.CV, and stat.ML

Abstract: Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNN). So far, high compression rate algorithms require part of the training dataset for a low precision calibration, or a fine-tuning process. However, this requirement is unacceptable when the data is unavailable or contains sensitive information, as in medical and biometric use-cases. We present three methods for generating synthetic samples from trained models. Then, we demonstrate how these samples can be used to calibrate and fine-tune quantized models without using any real data in the process. Our best performing method has a negligible accuracy degradation compared to the original training set. This method, which leverages intrinsic batch normalization layers' statistics of the trained model, can be used to evaluate data similarity. Our approach opens a path towards genuine data-free model compression, alleviating the need for training data during model deployment.

Citations (101)

Summary

  • The paper introduces synthetic sample generation techniques that enable model compression without the need for real training data.
  • It demonstrates that the BN-Statistics scheme maintains model accuracy on benchmarks like CIFAR and ImageNet with negligible degradation.
  • The approach promotes privacy-preserving and resource-efficient AI deployments by substituting sensitive data with synthetic calibration samples.

Understanding Data-Free Model Compression with Synthetic Samples

Why Do We Need Data-Free Compression?

Deep Neural Networks (DNNs) achieve impressive results but often at the cost of requiring substantial computational resources and storage. Compressing these models, particularly through quantization, helps us deploy them in resource-limited environments. However, traditional methods typically need access to some training data for fine-tuning or calibration to minimize accuracy degradation. This becomes problematic when the data isn't available or is sensitive, like in medical or biometric applications.

The Big Idea: Synthetic Samples for Calibration and Fine-Tuning

This paper explores a new avenue: using synthetic samples generated from the trained model itself, so that compression no longer depends on access to real data. The authors study three synthetic data generation methods:

  1. Gaussian Scheme: Samples are drawn randomly from a Gaussian distribution.
  2. Inception Scheme: Uses a technique called logit maximization to generate samples.
  3. BN-Statistics Scheme: Minimizes the divergence between the batch normalization statistics induced by synthetic data and the statistics recorded from the real training data.

These synthetic samples can then be used for tasks like quantization calibration and knowledge distillation (KD) without data access.
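To make this concrete, here is a minimal PyTorch sketch of how synthetic batches could drive post-training quantization calibration: forward hooks record per-layer activation ranges, from which a quantizer can derive scales and zero-points. The names (`collect_activation_ranges`, `synthetic_batches`) are illustrative rather than the paper's code, and a production pipeline would typically use a framework's built-in observers.

```python
import torch

def collect_activation_ranges(model, synthetic_batches):
    """Record per-layer activation min/max from synthetic batches.

    The collected ranges can seed the scale/zero-point of a
    post-training quantizer, with no real data involved.
    """
    ranges = {}

    def make_hook(name):
        def hook(module, inputs, output):
            lo, hi = output.min().item(), output.max().item()
            old_lo, old_hi = ranges.get(name, (lo, hi))
            ranges[name] = (min(old_lo, lo), max(old_hi, hi))
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules()
               if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear))]

    model.eval()
    with torch.no_grad():
        for x in synthetic_batches:
            model(x)

    for h in handles:
        h.remove()
    return ranges
```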

How Do These Methods Work?

Gaussian Scheme

Here, samples are drawn from a simple Gaussian distribution whose first and second moments mimic those of the original data. This works surprisingly well for calibration, but it falls short under more aggressive compression settings, where calibration alone cannot recover the lost accuracy.
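A minimal sketch of the scheme, assuming the network was trained on inputs normalized to roughly zero mean and unit variance (the ImageNet-sized `shape` default is purely illustrative):

```python
import torch

def gaussian_samples(num_samples, shape=(3, 224, 224), mean=0.0, std=1.0):
    """Draw synthetic inputs from a Gaussian whose mean/std match
    the (assumed) normalization of the original training data."""
    return torch.randn(num_samples, *shape) * std + mean

# e.g. a calibration batch of 64 ImageNet-sized inputs
batch = gaussian_samples(64)
```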

Inception Scheme

This scheme starts from an arbitrary input and, via back-propagation, updates it to maximize the model's output response for a chosen class. It can produce inputs that the model classifies with high confidence, but it requires careful hyperparameter tuning to avoid poor results.
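A PyTorch sketch of logit maximization follows. The optimizer choice, step count, and learning rate are placeholder values, exactly the kind of hyperparameters the scheme is sensitive to, and any image priors or regularizers are omitted for brevity:

```python
import torch

def inception_sample(model, target_class, shape=(1, 3, 224, 224),
                     steps=200, lr=0.1):
    """Optimize a random input so the model assigns a high logit
    to `target_class` (logit maximization)."""
    model.eval()
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -model(x)[0, target_class]  # ascend the target logit
        loss.backward()
        opt.step()
    return x.detach()
```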

BN-Statistics Scheme

This novel method exploits the running statistics stored in a trained model's batch normalization (BN) layers. Synthetic inputs are optimized so that the batch statistics they induce at each BN layer match the stored running mean and variance, preserving the model's internal consistency even without real data.
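A simplified sketch of the objective: forward pre-hooks compare the batch statistics induced at each BN layer by the candidate inputs against that layer's stored running mean and variance, and the inputs are optimized by gradient descent to shrink the gap. The squared-error divergence below is an illustrative assumption; the paper's exact formulation may differ.

```python
import torch

def bn_stats_loss(model, x):
    """Divergence between batch statistics induced by inputs `x`
    and each BN layer's stored running statistics."""
    losses = []

    def hook(module, inputs):
        inp = inputs[0]
        mu = inp.mean(dim=(0, 2, 3))                  # per-channel batch mean
        var = inp.var(dim=(0, 2, 3), unbiased=False)  # per-channel batch var
        losses.append(((mu - module.running_mean) ** 2).sum()
                      + ((var - module.running_var) ** 2).sum())

    handles = [m.register_forward_pre_hook(hook)
               for m in model.modules()
               if isinstance(m, torch.nn.BatchNorm2d)]

    model.eval()  # BN normalizes with running stats; nothing is updated
    model(x)
    for h in handles:
        h.remove()
    return torch.stack(losses).sum()

# generation loop sketch: `model` is any pretrained BN-bearing network
# x = torch.randn(64, 3, 224, 224, requires_grad=True)
# opt = torch.optim.Adam([x], lr=0.1)
# for _ in range(500):
#     opt.zero_grad()
#     bn_stats_loss(model, x).backward()
#     opt.step()
```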

Practical Results

Small-Scale Experiments

The paper's experiments on CIFAR-10 and CIFAR-100 show that samples from the BN-Statistics Scheme come closest to real data: models calibrated on them retain nearly the same accuracy as models calibrated on the training set, and fine-tuning with these samples incurred negligible accuracy degradation.

Large-Scale Experiments

Conducting experiments on ImageNet with models like ResNet-18 and MobileNet-V2, the authors found that synthetic samples generated using the BN-Statistics Scheme performed on par with real data, even for demanding quantization settings. That's a significant step forward, demonstrating the practical feasibility of data-free model compression.

What Could This Mean for the Future?

The implications of this research are far-reaching:

  • Data Privacy: By removing the reliance on training data, it's easier to comply with privacy regulations.
  • Wider Deployment: Compressed models can be deployed in more environments without compromising on performance.
  • Resource Efficiency: Generating synthetic samples adds an up-front computational cost, but it is a one-time expense, and the generation process leaves room for further optimization.

The paper's success also points to new research directions, like refining these methods for better performance and exploring their use in cross-dataset or cross-model transfer settings.

Final Thoughts

While the generation of high-quality synthetic samples without any real data is still a burgeoning area, this paper makes a compelling case for its feasibility and effectiveness. It invites data scientists and AI researchers to reimagine data-free model training and deployment, paving the way for more robust, privacy-preserving AI applications.

With continued advancements, the hope is to reach a point where the need for real training data in model deployment is minimal, safeguarding privacy while maintaining excellent performance across various AI applications.