- The paper presents DepGraph, a framework that automates dependency modeling for universal structural pruning in diverse neural architectures.
- It employs recursive dependency modeling to automatically group coupled parameters across layers so they can be pruned together.
- Experiments show significant speedups, such as a 2.57x acceleration on ResNet-56 while maintaining high accuracy on CIFAR-10.
DepGraph: A Framework for Universal Structural Pruning
The paper presents DepGraph, a framework designed to address the challenge of structural pruning across diverse neural network architectures. Structural pruning is an essential technique for reducing model size and increasing computational efficiency by removing entire groups of parameters. Automated, generalizable pruning strategies are needed because parameter-grouping patterns differ widely across model families such as CNNs, RNNs, GNNs, and Transformers.
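As background, structural pruning physically removes whole substructures, such as convolutional filters, rather than zeroing out individual weights. The following minimal PyTorch sketch is our own illustration, not code from the paper; `prune_conv_out_channels` is a hypothetical helper and it assumes `groups=1`:

```python
import torch
import torch.nn as nn

def prune_conv_out_channels(conv: nn.Conv2d, keep_idxs: list) -> nn.Conv2d:
    """Illustrative sketch (assumes groups=1): rebuild a Conv2d keeping
    only the output channels listed in keep_idxs."""
    pruned = nn.Conv2d(
        in_channels=conv.in_channels,
        out_channels=len(keep_idxs),
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    # Each output channel corresponds to one filter, i.e. one slice along
    # dim 0 of the weight tensor (shape: out_ch x in_ch x kH x kW).
    pruned.weight.data = conv.weight.data[keep_idxs].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idxs].clone()
    return pruned

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv = prune_conv_out_channels(conv, keep_idxs=[0, 2, 3, 5, 6, 7])  # drop filters 1 and 4
print(conv.weight.shape)  # torch.Size([6, 3, 3, 3])
```

This edit is only safe in isolation for a layer whose output feeds nothing else; any consumer of the pruned tensor must drop the same channels, which is exactly the coupling problem discussed next.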
Core Challenges in Structural Pruning
One of the main obstacles in structural pruning is structural coupling. In complex deep architectures, interdependencies between layers force multiple layers to be pruned simultaneously: removing a channel in one layer changes tensor shapes that other layers depend on. Failing to account for these dependencies can cause severe performance degradation or break the network outright with mismatched shapes. Traditional pruning methods rely on manually designed, architecture-specific schemes that do not readily generalize to new models; this paper targets a universal solution capable of handling structural pruning for arbitrary architectures.
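To make the coupling concrete, consider a toy residual block. The sketch below is our illustration, not the paper's code (`ToyBlock` and `prune_residual_channels` are hypothetical names); it prunes channels along the residual dimension and shows how many parameters a single channel drags along:

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """conv1 -> bn1 -> relu -> conv2 -> bn2, with an identity skip connection."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))
        return torch.relu(self.bn2(self.conv2(h)) + x)  # `+ x` couples in/out channels

def prune_residual_channels(block: ToyBlock, keep_idxs: list):
    """Prune channels of the residual dimension. Because of the `+ x` addition,
    the same indices must be removed from conv2's output, bn2's per-channel
    parameters AND running stats, and the block's input (conv1's input side).
    Pruning any one of these alone leaves mismatched tensor shapes."""
    idx = torch.tensor(keep_idxs)
    # conv2: drop output filters (dim 0 of weight) and matching bias entries.
    block.conv2.weight.data = block.conv2.weight.data[idx].clone()
    block.conv2.bias.data = block.conv2.bias.data[idx].clone()
    block.conv2.out_channels = len(keep_idxs)
    # bn2: every per-channel tensor is coupled to conv2's output channels.
    for name in ("weight", "bias", "running_mean", "running_var"):
        t = getattr(block.bn2, name)
        t.data = t.data[idx].clone()
    block.bn2.num_features = len(keep_idxs)
    # Skip connection adds the block input, so conv1 must drop the same
    # channels on its INPUT side (dim 1 of its weight tensor).
    block.conv1.weight.data = block.conv1.weight.data[:, idx].clone()
    block.conv1.in_channels = len(keep_idxs)

block = ToyBlock(8).eval()
prune_residual_channels(block, keep_idxs=[0, 2, 3, 5, 6, 7])
out = block(torch.randn(1, 6, 16, 16))  # input must already have 6 channels
print(out.shape)  # torch.Size([1, 6, 16, 16])
```

Here the chain of forced edits was worked out by hand; DepGraph's contribution is deriving such groups automatically for arbitrary architectures.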
Dependency Graph: An Automated Solution
The proposed solution, Dependency Graph (DepGraph), models the dependencies between layers of a neural network in a fully automatic manner. DepGraph acts as a transitive reduction of the full parameter-grouping relation: it records only local, inter-layer dependencies, which is far more compact than an explicit all-pairs grouping matrix. Through recursive dependency modeling, the method groups coupled parameters by collecting connected components of the graph, enabling correct pruning of any given architecture.
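A minimal sketch of the idea follows, under our own simplified assumptions; this is not the authors' implementation, and the hand-coded dependencies below stand in for what DepGraph derives automatically by tracing how tensors flow between layers. Nodes are (layer, dimension) pairs, and an edge states that pruning indices at one node forces the same indices at the other; a graph traversal then yields the full coupled group:

```python
from collections import defaultdict

class DependencyGraph:
    """Toy sketch: nodes are (layer, dim) pairs, e.g. ('conv2', 'out');
    an edge means 'pruning indices here forces the same indices there'."""
    def __init__(self):
        self.edges = defaultdict(set)

    def add_dependency(self, a, b):
        # Dependencies are symmetric for grouping purposes.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def pruning_group(self, start):
        """Iterative flood-fill from a start node: every node reached belongs
        to the same coupled group and must be pruned with identical indices."""
        group, frontier = {start}, [start]
        while frontier:
            node = frontier.pop()
            for nxt in self.edges[node]:
                if nxt not in group:
                    group.add(nxt)
                    frontier.append(nxt)
        return group

# Hand-coded dependencies for the toy residual block shown earlier.
DG = DependencyGraph()
DG.add_dependency(("conv2", "out"), ("bn2", "channels"))
DG.add_dependency(("bn2", "channels"), ("block_out", "channels"))
DG.add_dependency(("block_out", "channels"), ("block_in", "channels"))  # skip add
DG.add_dependency(("block_in", "channels"), ("conv1", "in"))

print(DG.pruning_group(("conv2", "out")))
# All five nodes: pruning any one of them drags in the other four.
```

Because each recorded edge is local, the full group is recovered transitively by traversal, which is the sense in which DepGraph serves as a reduced representation of the pairwise grouping matrix.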
Evaluation and Results
The DepGraph approach is evaluated extensively across architectures and modalities: ResNeXt, DenseNet, MobileNet, and Vision Transformers for images; GAT for graphs; DGCNN for 3D point clouds; and LSTMs for language. The paper reports competitive results, demonstrating substantial acceleration while maintaining accuracy comparable to state-of-the-art methods. For example, in CNN pruning, DepGraph achieves a 2.57x speedup on ResNet-56 with 93.64% accuracy on CIFAR-10, slightly surpassing the unpruned model.
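As an aside, a simple way to sanity-check acceleration claims on one's own hardware is to time forward passes before and after pruning. The snippet below is a generic measurement sketch, not the paper's evaluation protocol; `dense` and `pruned` are hypothetical models:

```python
import time
import torch

@torch.no_grad()
def avg_latency_ms(model, x, warmup=10, iters=50):
    """Average forward-pass latency in milliseconds (CPU timing for simplicity)."""
    model.eval()
    for _ in range(warmup):   # warm up caches / lazy initialization
        model(x)
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    return (time.perf_counter() - start) / iters * 1e3

# Hypothetical usage, with `dense` and `pruned` as the before/after models:
# x = torch.randn(1, 3, 32, 32)  # CIFAR-10-sized input
# speedup = avg_latency_ms(dense, x) / avg_latency_ms(pruned, x)
# print(f"measured speedup: {speedup:.2f}x")
```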
Implications and Future Work
DepGraph's universal applicability and automation carry significant practical and theoretical implications for network compression. By eliminating the need for manual, architecture-specific pruning schemes, DepGraph improves the generalizability and scalability of structural pruning methods. This is particularly pertinent in edge-computing scenarios, where resource constraints make efficient model compression a necessity.
Future work may extend the Dependency Graph to capture higher-order dependencies beyond pairwise layer interactions. Integrating more advanced importance criteria and training techniques could further improve the efficacy of structural pruning.
In conclusion, the DepGraph framework represents a pivotal development in the pursuit of universally applicable and automated structural pruning solutions. Its ability to generalize across diverse architectures without sacrificing performance marks a significant step forward in neural network compression methodologies.