DropNet: Reducing Neural Network Complexity via Iterative Pruning (2207.06646v1)

Published 14 Jul 2022 in cs.LG and cs.AI

Abstract: Modern deep neural networks require a significant amount of computing time and power to train and deploy, which limits their usage on edge devices. Inspired by the iterative weight pruning in the Lottery Ticket Hypothesis, we propose DropNet, an iterative pruning method which prunes nodes/filters to reduce network complexity. DropNet iteratively removes nodes/filters with the lowest average post-activation value across all training samples. Empirically, we show that DropNet is robust across diverse scenarios, including MLPs and CNNs using the MNIST, CIFAR-10 and Tiny ImageNet datasets. We show that up to 90% of the nodes/filters can be removed without any significant loss of accuracy. The final pruned network performs well even with reinitialization of the weights and biases. DropNet also has similar accuracy to an oracle which greedily removes nodes/filters one at a time to minimise training loss, highlighting its effectiveness.

A Critical Analysis of DropNet: Reducing Neural Network Complexity via Iterative Pruning

The paper "DropNet: Reducing Neural Network Complexity via Iterative Pruning" by John Tan Chong Min and Mehul Motani presents a refined approach to reducing the computational complexity of neural networks through an iterative pruning technique known as DropNet. This method is of considerable interest to the machine learning research community, operating within a context where the demand for efficient yet powerful neural network models is increasing, especially for deployment on edge devices with limited computational resources.

Methodology Overview

DropNet leverages the principle of iterative pruning, a concept inspired by the Lottery Ticket Hypothesis, to systematically reduce neural network complexity. It does this by iteratively removing nodes or filters with the lowest average post-activation value—a metric that quantifies the relevance of nodes/filters based on their activation values across training samples. This approach is generalized across different network architectures such as Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) and applies to datasets like MNIST, CIFAR-10, and Tiny ImageNet. Unlike many traditional pruning strategies that depend on specific weight reinitialization techniques, DropNet maintains robust performance following random reinitialization, facilitating broader applicability across existing machine learning frameworks.
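The mechanics of one pruning round are straightforward to sketch. The snippet below is a minimal PyTorch illustration of the idea rather than the authors' implementation: the helper names (`average_post_activation`, `dropnet_prune_step`), the mask-based pruning, and the fixed pruning fraction are assumptions made for clarity.

```python
import torch

def average_post_activation(model, loader, layer):
    """DropNet-style score: mean post-activation magnitude of each node
    in `layer`, averaged over all training samples."""
    acts = []
    hook = layer.register_forward_hook(
        lambda _mod, _inp, out: acts.append(out.detach().abs()))
    with torch.no_grad():
        for x, _ in loader:
            model(x)
    hook.remove()
    stacked = torch.cat(acts, dim=0)
    # For a fully connected layer `stacked` is (N, nodes); for a conv layer it is
    # (N, C, H, W), in which case average over dims (0, 2, 3) to score filters.
    return stacked.mean(dim=0)  # one score per node

def dropnet_prune_step(scores, mask, frac=0.2):
    """Zero out the `frac` lowest-scoring nodes that are still active."""
    active = mask.nonzero(as_tuple=True)[0]
    k = max(1, int(frac * active.numel()))
    to_drop = active[scores[active].argsort()[:k]]
    mask[to_drop] = 0.0
    return mask
```

In use, the loop alternates train, score, and prune until the target sparsity is reached; per the paper's reinitialization experiments, the surviving weights and biases may also be randomly reinitialized before retraining.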

Quantitative and Empirical Findings

The empirical results presented in the paper are compelling. DropNet's iterative pruning algorithm removes up to 90% of the nodes/filters without significant loss in accuracy, a result verified across several network architectures and datasets. For CNNs specifically, accuracy remains competitive even when up to 80% of the filters are pruned. The method also retains accuracy comparable to an oracle that greedily removes nodes/filters one at a time to minimise training loss.

Theoretical and Practical Implications

DropNet's pruning strategy, which is inherently data-driven, challenges the efficacy of traditional metrics such as APoZ by incorporating the average magnitude of post-activation values. This makes it particularly robust across varying architectural depths and provides an adaptable framework that can be readily deployed in resource-constrained environments, achieving significant reductions in energy consumption and computational overhead.

Furthermore, the paper’s analysis suggests a broader theoretical implication: pruning decisions made via layer-wise selection metrics (minimum_layer) perform better than global metrics (minimum), particularly in larger models such as ResNet18 and VGG19. This insight could drive refinements in future network designs, where architectural layers might be structured to better exploit layer-wise statistical non-uniformities.
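The difference between the two selection rules can also be sketched briefly. In the illustrative snippet below (the function names `global_minimum` and `layer_minimum` are ours, and ties at the threshold are ignored for simplicity), the global rule ranks all units of the network together and can therefore concentrate pruning on a single layer, whereas the layer-wise rule removes the lowest-scoring fraction within each layer separately.

```python
import torch

def global_minimum(scores_per_layer, frac=0.2):
    """'minimum': pool all scores and keep units above the global threshold."""
    pooled = torch.cat(list(scores_per_layer.values()))
    threshold = torch.quantile(pooled, frac)
    return {name: s > threshold for name, s in scores_per_layer.items()}

def layer_minimum(scores_per_layer, frac=0.2):
    """'minimum_layer': keep units above each layer's own threshold."""
    return {name: s > torch.quantile(s, frac)
            for name, s in scores_per_layer.items()}
```

In deep networks such as ResNet18 and VGG19, where activation statistics differ markedly across layers, the layer-wise rule avoids hollowing out layers whose scores are uniformly small, which is consistent with the paper's observation that minimum_layer outperforms minimum in larger models.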

Future Directions

The findings open avenues for further research into extending DropNet’s methodology to other neural network paradigms, such as Recurrent Neural Networks (RNNs) and Transformer-based models. Exploring alternative activation functions and their interaction with DropNet’s pruning metric could also reveal new optimization strategies.

Ultimately, DropNet makes a significant contribution to the optimization of neural networks, offering practical benefits and advancing the theoretical understanding of complexity reduction in deep learning. This research lays the groundwork for developing efficient neural network architectures suited to an increasingly broad array of applications.

Authors (2)
  1. John Tan Chong Min (3 papers)
  2. Mehul Motani (54 papers)
Citations (46)