
Discrimination-aware Network Pruning for Deep Model Compression (2001.01050v2)

Published 4 Jan 2020 in cs.CV

Abstract: We study network pruning which aims to remove redundant channels/kernels and hence speed up the inference of deep networks. Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones. Both strategies suffer from some limitations: the former kind is computationally expensive and difficult to converge, while the latter kind optimizes the reconstruction error but ignores the discriminative power of channels. In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power. Note that a channel often consists of a set of kernels. Besides the redundancy in channels, some kernels in a channel may also be redundant and fail to contribute to the discriminative power of the network, resulting in kernel level redundancy. To solve this, we propose a discrimination-aware kernel pruning (DKP) method to further compress deep networks by removing redundant kernels. To prevent DCP/DKP from selecting redundant channels/kernels, we propose a new adaptive stopping condition, which helps to automatically determine the number of selected channels/kernels and often results in more compact models with better performance. Extensive experiments on both image classification and face recognition demonstrate the effectiveness of our methods. For example, on ILSVRC-12, the resultant ResNet-50 model with 30% reduction of channels even outperforms the baseline model by 0.36% in terms of Top-1 accuracy. The pruned MobileNetV1 and MobileNetV2 achieve 1.93x and 1.42x inference acceleration on a mobile device, respectively, with negligible performance degradation. The source code and the pre-trained models are available at https://github.com/SCUT-AILab/DCP.

An Analysis of "Discrimination-aware Network Pruning for Deep Model Compression"

The paper "Discrimination-aware Network Pruning for Deep Model Compression" introduces a novel approach to network pruning aimed at enhancing the efficiency of deep neural networks (DNNs) without sacrificing their discriminative performance. This method—Discrimination-aware Channel Pruning (DCP)—addresses the limitations of traditional pruning techniques by focusing on preserving the discriminative power of the model, making it a compelling strategy for deep model compression.

Context and Motivation

Deep neural networks have achieved significant breakthroughs in numerous applications, including image classification and face recognition. However, their deployment on resource-constrained devices remains challenging due to their substantial memory and computational requirements. Traditional compression methods, such as network pruning, have been primarily categorized into two strategies: training-from-scratch with sparsity constraints and reconstruction-based methods. These approaches either suffer from high computational costs and convergence difficulties or fail to maintain the discriminative power of the network by focusing solely on minimizing reconstruction errors.

Discrimination-aware Channel Pruning (DCP)

The paper introduces DCP, which proposes a two-fold strategy: enhancing the discriminative power of intermediate layers and simultaneously considering both discrimination-aware loss and reconstruction error during pruning. The key components of this approach include:

  1. Discrimination-aware Loss: Additional losses are incorporated into the network to ensure that the intermediate representations contribute significantly to the discriminative task. This is a departure from traditional methods that often overlook the discriminative relevance of channels.
  2. Optimization Framework: The pruning process is framed as a sparsity-inducing optimization problem. A greedy algorithm is proposed to solve it, selecting channels that preserve the network's ability to distinguish between different classes (a sketch follows this list).
  3. Adaptive Stopping Criteria: Two adaptive stopping conditions are proposed to automatically determine the number of selected channels or kernels, effectively balancing model complexity with performance.
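
To make the selection step concrete, below is a minimal, hypothetical PyTorch sketch of greedy, gradient-based channel selection for a single layer under a joint reconstruction and discrimination-aware loss. The names (dcp_select_channels, aux_head, lambda_disc, tol) are illustrative assumptions and are not taken from the authors' released code; the paper's full procedure also optimizes the weights of the retained channels against the same joint loss, which this sketch omits.

```python
import torch
import torch.nn.functional as F

def dcp_select_channels(weight, feat_in, feat_target, labels, aux_head,
                        lambda_disc=1.0, tol=1e-3):
    """Greedily select input channels of one conv layer by the gradient
    magnitude of a joint reconstruction + discrimination-aware loss."""
    c_in = weight.shape[1]
    selected = torch.zeros(c_in, dtype=torch.bool)

    def joint_loss(w):
        out = F.conv2d(feat_in, w, padding=weight.shape[-1] // 2)
        recon = F.mse_loss(out, feat_target)           # reconstruction error
        disc = F.cross_entropy(aux_head(out), labels)  # discrimination-aware loss
        return recon + lambda_disc * disc

    prev_loss = None
    while not selected.all():
        # Zero out unselected channels, then take the gradient w.r.t. the full weight.
        w = (weight * selected.view(1, -1, 1, 1).float()).detach().requires_grad_(True)
        joint_loss(w).backward()
        # Rank the remaining channels by gradient norm and keep the most useful one.
        grad_norm = w.grad.pow(2).sum(dim=(0, 2, 3)).sqrt()
        grad_norm[selected] = float("-inf")
        selected[grad_norm.argmax()] = True
        # Adaptive stopping: halt once adding channels barely changes the loss.
        with torch.no_grad():
            cur_loss = joint_loss(weight * selected.view(1, -1, 1, 1).float()).item()
        if prev_loss is not None and abs(prev_loss - cur_loss) < tol * prev_loss:
            break
        prev_loss = cur_loss
    return selected
```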

Empirical Evaluation

The proposed DCP method is evaluated extensively across various architectures such as ResNet and MobileNet on datasets like CIFAR-10 and ILSVRC-12. The results indicate that DCP can achieve substantial reductions in the number of parameters and FLOPs while occasionally even improving the model's accuracy—a significant achievement indicative of effective channel selection.

For instance, the ResNet-50 model pruned to reduce 30% of its channels not only decreases computational overhead but also marginally improves its Top-1 accuracy by 0.36%. Additionally, the pruned MobileNet models exhibit considerable inference acceleration on a Qualcomm Snapdragon 845 processor, demonstrating the method’s practical applicability in mobile environments.
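For intuition about where such speedups come from, a back-of-the-envelope FLOP count for a single convolution (a generic estimate, not the paper's measurement protocol) shows that pruning 30% of both the input and output channels removes roughly half of that layer's multiply-accumulate operations:

```python
def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count for a k x k convolution on an h x w output map."""
    return c_in * c_out * k * k * h * w

full = conv_flops(256, 256, 3, 56, 56)
pruned = conv_flops(int(256 * 0.7), int(256 * 0.7), 3, 56, 56)
print(f"MAC ratio after pruning 30% of channels: {pruned / full:.2f}")  # ~0.49
```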

Discrimination-aware Kernel Pruning (DKP)

Further extending the concept, the paper explores Discrimination-aware Kernel Pruning (DKP) to address redundancy at the kernel level, proposing a more fine-grained compression strategy. The DKP approach leverages the same principles of discriminative power preservation and indicates significant potential for deeper network compression without notable loss of performance.
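
To illustrate the difference in granularity between the two methods, the following sketch (illustrative only, not from the paper's code) contrasts a channel-level mask, which removes an entire input channel across all filters, with a kernel-level mask, which can zero a single k x k kernel of one filter while keeping the same input channel in every other filter:

```python
import torch

weight = torch.randn(64, 32, 3, 3)             # conv weight: (C_out, C_in, k, k)

# Channel pruning (DCP): remove an entire input channel across all 64 filters.
channel_mask = torch.ones(32)
channel_mask[10] = 0.0                         # drop input channel 10 everywhere
channel_pruned = weight * channel_mask.view(1, -1, 1, 1)

# Kernel pruning (DKP): remove a single 3x3 kernel for one (filter, channel) pair,
# leaving that input channel intact in every other filter.
kernel_mask = torch.ones(64, 32)
kernel_mask[5, 10] = 0.0                       # drop only filter 5's kernel on channel 10
kernel_pruned = weight * kernel_mask.view(64, 32, 1, 1)
```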

Implications and Future Directions

The implications of this research are two-fold: it provides a methodology for more efficient deployment of neural networks on mobile and embedded devices, and it sets the stage for further advances in discrimination-aware model compression. Future research could explore combining pruning with quantization, potentially offering further gains by simultaneously reducing bit precision and architectural complexity.

In summary, the paper convincingly demonstrates that discrimination-aware methods for network pruning can be a robust path forward in the ongoing endeavor to optimize neural networks for diverse computational budgets while maintaining high performance. This work lays crucial groundwork for ongoing research into efficient, scalable deep learning models.

Authors (7)
  1. Jing Liu (525 papers)
  2. Bohan Zhuang (79 papers)
  3. Zhuangwei Zhuang (7 papers)
  4. Yong Guo (67 papers)
  5. Junzhou Huang (137 papers)
  6. Jinhui Zhu (6 papers)
  7. Mingkui Tan (124 papers)
Citations (105)