Data-free parameter pruning for Deep Neural Networks (1507.06149v1)

Published 22 Jul 2015 in cs.CV

Abstract: Deep Neural nets (NNs) with millions of parameters are at the heart of many state-of-the-art computer vision systems today. However, recent works have shown that much smaller models can achieve similar levels of performance. In this work, we address the problem of pruning parameters in a trained NN model. Instead of removing individual weights one at a time as done in previous works, we remove one neuron at a time. We show how similar neurons are redundant, and propose a systematic way to remove them. Our experiments in pruning the densely connected layers show that we can remove up to 85% of the total parameters in an MNIST-trained network, and about 35% for AlexNet without significantly affecting performance. Our method can be applied on top of most networks with a fully connected layer to give a smaller network.

Data-Free Parameter Pruning for Deep Neural Networks

The paper "Data-free Parameter Pruning for Deep Neural Networks" by Suraj Srinivas and R. Venkatesh Babu introduces a novel approach to the problem of parameter pruning in trained deep neural networks (DNNs). Unlike previous methods that focus on removing individual weights, this paper proposes a neuron-centric approach, pruning one neuron at a time. This technique is particularly beneficial for reducing the complexity of densely connected layers in deep learning models without significantly affecting their performance.

Key Contributions

The authors challenge conventional weight-based pruning by demonstrating that identifying redundant neurons and removing them wholesale yields a more streamlined model. Their method stands out because it requires no training or validation data for the compression step. The paper highlights two significant outcomes:

  1. Parameter Reduction: The proposed method achieves up to 85% reduction in parameters for a neural network trained on the MNIST dataset and approximately 35% for an AlexNet model, with negligible loss in accuracy.
  2. Network Applicability: This technique can be implemented on any network with fully connected layers, offering a versatile solution to neural network complexity.

Methodological Insights

The core of the method is an analysis of neuron redundancy: neurons whose incoming weights are similar produce similar activations, so one neuron in each such pair is effectively surplus. Leveraging a systematic pruning strategy, the authors remove these redundant neurons. The pruning process involves:

  • Identifying pairs of neurons whose incoming weight vectors, and therefore activations, are nearly the same.
  • Performing a 'surgery' step that folds the removed neuron's outgoing weights into those of the retained, similar neuron, compensating the next layer and preserving the network structure (see the sketch after this list).
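
The sketch below illustrates this selection-and-surgery loop on a single fully connected layer. It is a minimal, unoptimized reading of the idea rather than the authors' implementation: the function name, the brute-force pair search, and the exact saliency expression (outgoing-weight magnitude times the squared distance between incoming weight vectors) are illustrative assumptions.

```python
import numpy as np

def prune_fc_layer(W_in, W_out, num_to_remove):
    """Sketch of data-free neuron pruning for one fully connected layer.

    W_in  : array of shape (d, n); column j holds neuron j's incoming weights.
    W_out : array of shape (n, m); row j holds neuron j's outgoing weights.
    """
    W_in, W_out = W_in.copy(), W_out.copy()
    alive = list(range(W_in.shape[1]))

    for _ in range(num_to_remove):
        best_sal, best_pair = np.inf, None
        # Brute-force search over remaining pairs (quadratic, fine for a sketch).
        for a, i in enumerate(alive):
            for j in alive[a + 1:]:
                dist = np.sum((W_in[:, i] - W_in[:, j]) ** 2)
                sal = np.sum(W_out[j] ** 2) * dist  # assumed saliency form
                if sal < best_sal:
                    best_sal, best_pair = sal, (i, j)
        keep, drop = best_pair
        # 'Surgery': fold the dropped neuron's outgoing weights into the kept one.
        W_out[keep] += W_out[drop]
        alive.remove(drop)

    return W_in[:, alive], W_out[alive]
```

Applied network-wide, such a routine would run layer by layer over the fully connected layers only, matching the scope stated in the abstract.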

The theoretical underpinning is strengthened by linking the approach to Hebbian theory's adage that neurons which 'fire together, wire together.' The paper illustrates this with a toy model, showing how neurons with identical weight sets can be fused without changing the output.
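
As a concrete illustration of that toy model, the following self-contained check (constructed here, not taken from the paper; the shapes and random values are arbitrary) verifies numerically that when two neurons share identical incoming weights, adding one neuron's outgoing weights to the other's and deleting it leaves the next layer's pre-activations unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # input to the layer
W = rng.normal(size=(4, 3))       # incoming weights of 3 hidden neurons
W[:, 2] = W[:, 1]                 # make neurons 1 and 2 identical
A = rng.normal(size=(3, 2))       # outgoing weights to the next layer

relu = lambda z: np.maximum(z, 0.0)
before = A.T @ relu(W.T @ x)      # next layer's pre-activations, original net

# Fuse neuron 2 into neuron 1: add its outgoing weights, then drop it.
A_fused = A[:2].copy()
A_fused[1] += A[2]
after = A_fused.T @ relu(W[:, :2].T @ x)

assert np.allclose(before, after)  # identical neurons fuse with no output change
```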

Implications and Comparisons

The authors draw comparisons with traditional techniques such as Optimal Brain Damage (OBD) and Knowledge Distillation (KD). They acknowledge the similarities and differences in approach, noting that while OBD offers finer, weight-level control, it is computationally intensive and needs training data for its second-order saliency estimates. In contrast, their method's efficiency and independence from data present a clear advantage, especially for large-scale networks.

Experimental Validation

The pruning method's efficacy is validated through experiments on both the MNIST dataset and AlexNet architecture. The results are compelling, showing substantial model compression with minimal accuracy degradation. The paper provides detailed comparative data against naive methods and random pruning, underscoring the superiority of their approach.

For practical utility, the authors propose a heuristic for deciding how many neurons to prune by tracking how the saliency values change as pruning proceeds. This gives adaptive control over model complexity driven by the network's own weight statistics rather than a fixed, hand-picked count; a simple cutoff in that spirit is sketched below.
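
The snippet below is one plausible cutoff rule consistent with that description, not the paper's exact procedure: keep pruning while saliencies remain in the low regime seen so far, and stop at the first sharp jump. The function name and the `jump_factor` threshold are illustrative choices.

```python
import numpy as np

def choose_num_to_prune(saliencies, jump_factor=5.0):
    """Heuristic cutoff sketch: prune greedily while saliency stays small,
    and stop at the first large jump relative to the running mean."""
    s = np.sort(np.asarray(saliencies, dtype=float))
    for k in range(1, len(s)):
        running_mean = s[:k].mean() + 1e-12
        if s[k] > jump_factor * running_mean:
            return k           # prune only the k lowest-saliency neurons
    return len(s)              # no sharp jump found: all candidates look cheap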

Theoretical and Practical Implications

From a theoretical standpoint, this research contributes to the broader understanding of parameter redundancy in neural networks, offering insights into efficient model simplification. Practically, the implications are significant for deploying DNNs in resource-constrained environments, where model size and evaluation speed are critical.

Future Directions

Potential future work includes extending the method to convolutional layers and further refining heuristic strategies for data-free pruning. Integrating this technique with other model compression methods could yield even more optimized architectures.

In summary, this paper provides a robust, data-free pruning strategy that holds promise for enhancing model efficiency while maintaining performance. The approach presents a valuable contribution to neural network optimization, with broad applicability across various architectures.

Authors (2)
  1. Suraj Srinivas (28 papers)
  2. R. Venkatesh Babu (108 papers)
Citations (532)