
Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning (1912.08881v3)

Published 18 Dec 2019 in cs.LG, cs.NE, and stat.ML

Abstract: The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the weights of various layers while at the same time aiming to not sacrifice performance. In this paper, we propose a novel criterion for CNN pruning inspired by neural network interpretability: The most relevant units, i.e. weights or filters, are automatically found using their relevance scores obtained from concepts of explainable AI (XAI). By exploring this idea, we connect the lines of interpretability and model compression research. We show that our proposed method can efficiently prune CNN models in transfer-learning setups in which networks pre-trained on large corpora are adapted to specialized tasks. The method is evaluated on a broad range of computer vision datasets. Notably, our novel criterion is not only competitive or better compared to state-of-the-art pruning criteria when successive retraining is performed, but clearly outperforms these previous criteria in the resource-constrained application scenario in which the data of the task to be transferred to is very scarce and one chooses to refrain from fine-tuning. Our method is able to compress the model iteratively while maintaining or even improving accuracy. At the same time, it has a computational cost in the order of gradient computation and is comparatively simple to apply without the need for tuning hyperparameters for pruning.

Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning

The paper "Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning" proposes an innovative methodology for optimizing Convolutional Neural Networks (CNNs) by leveraging concepts from explainable AI (XAI). The authors introduce a pruning criterion based on Layer-wise Relevance Propagation (LRP), which assesses the importance of network units—such as weights or filters—using relevance scores. This approach aligns model interpretability and compression efforts, providing a theoretically grounded solution that distinguishes itself from traditional heuristic-based methods.

Key Insights and Experimental Results

The success of CNNs across various domains, such as image classification and medical diagnostics, has led to models with substantial computational and storage costs. The paper identifies an opportunity within these high-capacity models: many parameters, while facilitating learning, contribute little to task-specific predictive performance once training is complete. Pruning therefore offers a path to reducing model complexity without significantly compromising accuracy.

The core innovation in this paper is the use of relevance scores obtained from LRP for assessing the significance of network elements. Notably, the method circumvents the need for extra heuristic steps or hyperparameter tuning for pruning, presenting a scalable approach suitable for large-scale applications. The authors demonstrate that LRP-based pruning effectively balances compression and performance, outperforming existing criteria, particularly in resource-constrained scenarios requiring transfer learning without fine-tuning.
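As a hedged illustration of how relevance scores could drive a pruning decision, the sketch below aggregates LRP relevance into one score per convolutional filter and marks a globally least-relevant fraction for removal. The aggregation over batch and spatial dimensions, the global threshold, and all names here are our own simplifications rather than the paper's exact procedure.

```python
import torch

def filter_relevance(layer_relevance):
    # Sum LRP relevance over batch and spatial dimensions:
    # one importance score per convolutional filter (channel).
    # layer_relevance: (batch, channels, H, W)
    return layer_relevance.sum(dim=(0, 2, 3))

def select_filters_to_prune(scores_per_layer, prune_fraction=0.1):
    # Pool scores from all layers and find a global relevance threshold.
    # scores_per_layer: dict mapping layer name -> 1-D tensor of filter scores.
    flat = torch.cat(list(scores_per_layer.values()))
    k = max(1, int(prune_fraction * flat.numel()))
    threshold = flat.kthvalue(k).values  # k-th smallest relevance
    # Indices of the filters at or below the threshold in each layer.
    return {name: (s <= threshold).nonzero().flatten()
            for name, s in scores_per_layer.items()}
```

In an iterative setting, one would alternate such a pruning step with optional fine-tuning until the target compression rate is reached.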

The LRP criterion maintains model accuracy while reducing floating-point operations per inference and storage requirements. Comparative experiments against weight-magnitude, gradient-based, and Taylor-expansion criteria show that LRP consistently preserves model functionality across diverse datasets and architectures, including VGG, AlexNet, and ResNet models. Moreover, its ability to localize and preserve essential neurons across all network layers without explicit regularization further supports LRP's robustness and efficiency. A rough sketch of these baseline criteria follows.
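For context, the baseline criteria can be sketched as below; tensor shapes follow the PyTorch convolution convention, and the exact normalizations used in the paper and its baselines may differ.

```python
import torch

def weight_criterion(conv_weight):
    # Magnitude-based criterion: L1 norm of each filter's weights.
    # conv_weight: (out_channels, in_channels, kH, kW)
    return conv_weight.detach().abs().sum(dim=(1, 2, 3))

def taylor_criterion(activation, grad_activation):
    # First-order Taylor criterion: |activation * gradient of the loss
    # w.r.t. that activation|, averaged over batch and spatial positions.
    # Both tensors: (batch, channels, H, W)
    return (activation * grad_activation).abs().mean(dim=(0, 2, 3))
```

The Taylor variant shown here follows the common activation-times-gradient form; it requires one backward pass per batch, which is also the order of cost the paper reports for the LRP criterion.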

Implications and Future Research Directions

Practically, this method enhances the applicability of neural networks on devices with limited computational resources, such as mobile and embedded systems. The LRP criterion's capacity to prune models pre-trained on extensive datasets such as ILSVRC 2012, making them adaptable to specialized tasks with scarce data, reflects its potential for democratizing AI capabilities across varied deployment environments.

Theoretically, linking interpretability with pruning harnesses explainable AI's strengths to address computational bottlenecks, paving the way for further exploration into unified frameworks for model optimization and explanation. This fused approach could guide deeper investigations into neural network functionality, shedding light on which pathways matter for a prediction and how models reach their decisions.

Future research may explore adapting the criterion to other network architectures and investigating synergies with other explainability methods. Additionally, assessing the impact of different LRP variants on model compression and interpretability could yield solutions more tailored to specific application domains.

In conclusion, the paper presents a compelling case for leveraging explainable AI techniques in optimizing neural networks, striking a balance between performance and resource efficiency. While the LRP criterion succeeds in bridging critical gaps in neural network compression, its implications point to a broader horizon for AI technologies grounded in human-understandable principles.

Authors (7)
  1. Seul-Ki Yeom
  2. Philipp Seegerer
  3. Sebastian Lapuschkin
  4. Alexander Binder
  5. Simon Wiedemann
  6. Klaus-Robert Müller
  7. Wojciech Samek
Citations (180)