Content-Aware GAN Compression: Enhancing Efficiency and Maintaining Quality
The paper "Content-Aware GAN Compression" presents a series of novel approaches for compressing unconditional generative adversarial networks (GANs) with a focus on improving deployment efficiency on edge devices, such as mobile phones, while maintaining image quality. The authors introduce advanced channel pruning and knowledge distillation techniques tailored specifically for this purpose.
Methodology Overview
GANs such as StyleGAN2 are resource-intensive, demanding computational budgets that make deployment on resource-constrained devices difficult. Generic compression techniques designed for discriminative models have proven ineffective for GANs, motivating specialized solutions. The methods proposed here address this challenge with channel pruning and knowledge distillation that are content-aware, meaning they selectively focus on semantically important image regions (e.g., the face region in a portrait) rather than treating all pixels equally.
- Channel Pruning:
- The paper introduces two pruning metrics: ℓ1-out and content-aware ℓ1-out. The latter incorporates spatial information about the generated content, identifying redundant channels by measuring how sensitive each channel is to noise perturbations applied within the content region (a minimal sketch follows this list).
- Knowledge Distillation:
- Distillation guides the pruned student network to mimic the teacher network's outputs. The paper combines pixel-level and perceptual-level losses, including LPIPS, and the content-aware variant concentrates the distillation signal on semantically important regions, improving perceived image quality (see the distillation sketch below).
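To make the pruning side concrete, below is a minimal PyTorch sketch. It reads ℓ1-out as the ℓ1-norm of the following layer's weights that consume each channel, and implements the content-aware variant as a gradient-sensitivity score computed only over a masked content region. Note that `content_mask_fn` is a hypothetical stand-in for the saliency/parsing model, and the sensitivity objective is an illustrative assumption rather than the paper's exact formula.

```python
import torch
import torch.nn as nn

def l1_out_scores(next_conv: nn.Conv2d) -> torch.Tensor:
    """One reading of the l1-out metric: score each input channel of the
    following conv by the l1-norm of the outgoing weights that consume it.
    Channels with small scores contribute little downstream and are
    candidates for pruning."""
    w = next_conv.weight.detach().abs()   # shape: (out_ch, in_ch, kH, kW)
    return w.sum(dim=(0, 2, 3))           # one score per input channel

def content_aware_scores(generator: nn.Module, layer: nn.Module,
                         content_mask_fn, n_samples: int = 32,
                         z_dim: int = 512) -> torch.Tensor:
    """Sketch of a content-aware score: how strongly each channel of `layer`
    influences the generator's output *inside* the content region."""
    acts = {}
    hook = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    total = None
    for _ in range(n_samples):
        z = torch.randn(1, z_dim)
        img = generator(z)
        mask = content_mask_fn(img).detach()    # 1 inside content, 0 outside
        response = (img * mask).abs().sum()     # content-only output response
        (grad,) = torch.autograd.grad(response, acts["a"])
        score = grad.abs().mean(dim=(0, 2, 3))  # per-channel sensitivity
        total = score if total is None else total + score
    hook.remove()
    return total / n_samples
```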
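A matching sketch of the distillation objective, assuming a frozen full-size `teacher`, a pruned `student`, and the same hypothetical `content_mask_fn`. The perceptual term uses the off-the-shelf `lpips` package; masking both images before computing the losses is one simple way to make distillation content-aware, not necessarily the paper's exact construction.

```python
import torch
import lpips  # pip install lpips; LPIPS perceptual distance (Zhang et al.)

percep = lpips.LPIPS(net='vgg')  # expects 3-channel images in [-1, 1]

def distill_loss(teacher, student, content_mask_fn, z_dim=512,
                 w_pix=1.0, w_per=1.0):
    """Content-aware distillation: the student matches the teacher only
    where the mask says the content lives."""
    z = torch.randn(4, z_dim)
    with torch.no_grad():
        target = teacher(z)              # teacher output is the fixed target
    pred = student(z)
    mask = content_mask_fn(target)       # (N, 1, H, W), broadcast over RGB
    pixel = ((pred - target) * mask).abs().mean()
    perceptual = percep(pred * mask, target * mask).mean()
    return w_pix * pixel + w_per * perceptual
```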
Results and Performance
The proposed compression methods show significant improvements over existing techniques such as GAN-Slimming on both SN-GAN and StyleGAN2. Notably, for StyleGAN2, the paper reports an 11× reduction in FLOPs with visually negligible loss in image quality. The compressed models also exhibit smoother latent spaces, which translates into better image editing and manipulation, such as style mixing and morphing (a small interpolation sketch follows the list below).
- On CIFAR-10 with SN-GAN, the techniques achieve comparable or superior Inception Scores at both 2× and 4× acceleration, a clear improvement over the GAN-Slimming (GS) baseline.
- On the FFHQ dataset with StyleGAN2, both the 256px and 1024px compressed models maintain low FID scores, preserving image fidelity while offering substantially lower computational cost.
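The latent-smoothness claim is easy to probe directly: morphing between two generated images amounts to decoding interpolated latents, as in this small sketch (plain interpolation in z for brevity; StyleGAN2 editing typically interpolates in the intermediate W space).

```python
import torch

def morph(generator, z_a, z_b, steps=8):
    """Decode a straight line between two latents; with a smooth latent
    space, the intermediate frames transition gracefully."""
    frames = []
    with torch.no_grad():
        for t in torch.linspace(0.0, 1.0, steps):
            frames.append(generator((1 - t) * z_a + t * z_b))
    return frames
```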
Implications and Future Directions
The implications of this research are multifaceted. Practically, the approach makes deploying state-of-the-art GANs on edge devices more feasible, broadening the potential applications of GAN-generated content in real-time environments. Theoretically, the research advances understanding of content-aware network manipulation, suggesting avenues for integrating semantic awareness in model designs more broadly.
Future work could extend content-aware compression to other generative models, such as variational autoencoders or diffusion models. Refining the content-selection criteria and exploring alternative distillation losses could further improve both output quality and compression efficiency. The smooth latent manifolds of the compressed GANs also suggest potential gains in tasks that require semantic continuity, such as video generation and transition effects in media.