- The paper presents DepGraph, a framework that automates dependency modeling for universal structural pruning in diverse neural architectures.
- It employs recursive dependency modeling to automatically group coupled parameters across layers so they can be pruned together.
- Experiments show significant speedups, such as a 2.57x acceleration on ResNet-56 while maintaining high accuracy on CIFAR-10.
DepGraph: A Framework for Universal Structural Pruning
The paper presents DepGraph, a framework designed to address the challenge of structural pruning across diverse neural network architectures. Structural pruning is an essential technique for reducing model size and increasing computational efficiency by removing entire groups of parameters. Automated, generalizable pruning strategies are needed because parameter-grouping patterns differ widely across model families such as CNNs, RNNs, GNNs, and Transformers.
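As background, structural pruning physically removes whole substructures, such as convolutional filters, rather than zeroing out individual weights. The following minimal PyTorch sketch is our own illustration, not code from the paper; `prune_conv_out_channels` is a hypothetical helper and it assumes `groups=1`:

```python
import torch
import torch.nn as nn

def prune_conv_out_channels(conv: nn.Conv2d, keep_idxs: list) -> nn.Conv2d:
    """Illustrative sketch (assumes groups=1): rebuild a Conv2d keeping
    only the output channels listed in keep_idxs."""
    pruned = nn.Conv2d(
        in_channels=conv.in_channels,
        out_channels=len(keep_idxs),
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    # Each output channel corresponds to one filter, i.e. one slice along
    # dim 0 of the weight tensor (shape: out_ch x in_ch x kH x kW).
    pruned.weight.data = conv.weight.data[keep_idxs].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idxs].clone()
    return pruned

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv = prune_conv_out_channels(conv, keep_idxs=[0, 2, 3, 5, 6, 7])  # drop filters 1 and 4
print(conv.weight.shape)  # torch.Size([6, 3, 3, 3])
```

This edit is only safe in isolation for a layer whose output feeds nothing else; any consumer of the pruned tensor must drop the same channels, which is exactly the coupling problem discussed next.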
Core Challenges in Structural Pruning
One of the main obstacles in structural pruning is structural coupling. In complex deep architectures, interdependencies between layers force multiple layers to be pruned simultaneously: removing a channel in one layer changes tensor shapes that other layers depend on. Failing to account for these dependencies can cause severe performance degradation or break the network outright with mismatched shapes. Traditional pruning methods rely on manually designed, architecture-specific schemes that do not readily generalize to new models; this paper targets a universal solution capable of handling structural pruning for arbitrary architectures.
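To make the coupling concrete, consider a toy residual block. The sketch below is our illustration, not the paper's code (`ToyBlock` and `prune_residual_channels` are hypothetical names); it prunes channels along the residual dimension and shows how many parameters a single channel drags along:

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """conv1 -> bn1 -> relu -> conv2 -> bn2, with an identity skip connection."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))
        return torch.relu(self.bn2(self.conv2(h)) + x)  # `+ x` couples in/out channels

def prune_residual_channels(block: ToyBlock, keep_idxs: list):
    """Prune channels of the residual dimension. Because of the `+ x` addition,
    the same indices must be removed from conv2's output, bn2's per-channel
    parameters AND running stats, and the block's input (conv1's input side).
    Pruning any one of these alone leaves mismatched tensor shapes."""
    idx = torch.tensor(keep_idxs)
    # conv2: drop output filters (dim 0 of weight) and matching bias entries.
    block.conv2.weight.data = block.conv2.weight.data[idx].clone()
    block.conv2.bias.data = block.conv2.bias.data[idx].clone()
    block.conv2.out_channels = len(keep_idxs)
    # bn2: every per-channel tensor is coupled to conv2's output channels.
    for name in ("weight", "bias", "running_mean", "running_var"):
        t = getattr(block.bn2, name)
        t.data = t.data[idx].clone()
    block.bn2.num_features = len(keep_idxs)
    # Skip connection adds the block input, so conv1 must drop the same
    # channels on its INPUT side (dim 1 of its weight tensor).
    block.conv1.weight.data = block.conv1.weight.data[:, idx].clone()
    block.conv1.in_channels = len(keep_idxs)

block = ToyBlock(8).eval()
prune_residual_channels(block, keep_idxs=[0, 2, 3, 5, 6, 7])
out = block(torch.randn(1, 6, 16, 16))  # input must already have 6 channels
print(out.shape)  # torch.Size([1, 6, 16, 16])
```

Here the chain of forced edits was worked out by hand; DepGraph's contribution is deriving such groups automatically for arbitrary architectures.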
Dependency Graph: An Automated Solution
The proposed solution, Dependency Graph (DepGraph), models the dependencies between layers of a neural network in a fully automatic manner. DepGraph acts as a transitive reduction of the full parameter-grouping relation: it records only local, inter-layer dependencies, which is far more compact than an explicit all-pairs grouping matrix. Through recursive dependency modeling, the method groups coupled parameters by collecting connected components of the graph, enabling correct pruning of any given architecture.
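A minimal sketch of the idea follows, under our own simplified assumptions; this is not the authors' implementation, and the hand-coded dependencies below stand in for what DepGraph derives automatically by tracing how tensors flow between layers. Nodes are (layer, dimension) pairs, and an edge states that pruning indices at one node forces the same indices at the other; a graph traversal then yields the full coupled group:

```python
from collections import defaultdict

class DependencyGraph:
    """Toy sketch: nodes are (layer, dim) pairs, e.g. ('conv2', 'out');
    an edge means 'pruning indices here forces the same indices there'."""
    def __init__(self):
        self.edges = defaultdict(set)

    def add_dependency(self, a, b):
        # Dependencies are symmetric for grouping purposes.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def pruning_group(self, start):
        """Iterative flood-fill from a start node: every node reached belongs
        to the same coupled group and must be pruned with identical indices."""
        group, frontier = {start}, [start]
        while frontier:
            node = frontier.pop()
            for nxt in self.edges[node]:
                if nxt not in group:
                    group.add(nxt)
                    frontier.append(nxt)
        return group

# Hand-coded dependencies for the toy residual block shown earlier.
DG = DependencyGraph()
DG.add_dependency(("conv2", "out"), ("bn2", "channels"))
DG.add_dependency(("bn2", "channels"), ("block_out", "channels"))
DG.add_dependency(("block_out", "channels"), ("block_in", "channels"))  # skip add
DG.add_dependency(("block_in", "channels"), ("conv1", "in"))

print(DG.pruning_group(("conv2", "out")))
# All five nodes: pruning any one of them drags in the other four.
```

Because each recorded edge is local, the full group is recovered transitively by traversal, which is the sense in which DepGraph serves as a reduced representation of the pairwise grouping matrix.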
Evaluation and Results
The DepGraph approach is evaluated extensively across architectures and modalities: ResNeXt, DenseNet, MobileNet, and Vision Transformers for images; GAT for graphs; DGCNN for 3D point clouds; and LSTMs for language. The paper reports competitive results, demonstrating substantial acceleration while maintaining accuracy comparable to state-of-the-art methods. For example, in CNN pruning, DepGraph achieves a 2.57x speedup on ResNet-56 with 93.64% accuracy on CIFAR-10, slightly surpassing the unpruned model.
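As an aside, a simple way to sanity-check acceleration claims on one's own hardware is to time forward passes before and after pruning. The snippet below is a generic measurement sketch, not the paper's evaluation protocol; `dense` and `pruned` are hypothetical models:

```python
import time
import torch

@torch.no_grad()
def avg_latency_ms(model, x, warmup=10, iters=50):
    """Average forward-pass latency in milliseconds (CPU timing for simplicity)."""
    model.eval()
    for _ in range(warmup):   # warm up caches / lazy initialization
        model(x)
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    return (time.perf_counter() - start) / iters * 1e3

# Hypothetical usage, with `dense` and `pruned` as the before/after models:
# x = torch.randn(1, 3, 32, 32)  # CIFAR-10-sized input
# speedup = avg_latency_ms(dense, x) / avg_latency_ms(pruned, x)
# print(f"measured speedup: {speedup:.2f}x")
```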
Implications and Future Work
DepGraph's universal applicability and automation carry significant practical and theoretical implications for network compression. By eliminating the need for manual, architecture-specific pruning schemes, DepGraph improves the generalizability and scalability of structural pruning methods. This is particularly pertinent in edge-computing scenarios, where resource constraints make efficient model compression a necessity.
Future work may extend the Dependency Graph to capture higher-order dependencies beyond pairwise layer interactions. Integrating more advanced importance criteria and training techniques could further improve the efficacy of structural pruning.
In conclusion, the DepGraph framework represents a pivotal development in the pursuit of universally applicable and automated structural pruning solutions. Its ability to generalize across diverse architectures without sacrificing performance marks a significant step forward in neural network compression methodologies.