Layer-adaptive sparsity for the Magnitude-based Pruning (2010.07611v2)

Published 15 Oct 2020 in cs.LG

Abstract: Recent discoveries on neural network pruning reveal that, with a carefully chosen layerwise sparsity, a simple magnitude-based pruning achieves state-of-the-art tradeoff between sparsity and performance. However, without a clear consensus on "how to choose," the layerwise sparsities are mostly selected algorithm-by-algorithm, often resorting to handcrafted heuristics or an extensive hyperparameter search. To fill this gap, we propose a novel importance score for global pruning, coined layer-adaptive magnitude-based pruning (LAMP) score; the score is a rescaled version of weight magnitude that incorporates the model-level $\ell_2$ distortion incurred by pruning, and does not require any hyperparameter tuning or heavy computation. Under various image classification setups, LAMP consistently outperforms popular existing schemes for layerwise sparsity selection. Furthermore, we observe that LAMP continues to outperform baselines even in weight-rewinding setups, while the connectivity-oriented layerwise sparsity (the strongest baseline overall) performs worse than a simple global magnitude-based pruning in this case. Code: https://github.com/jaeho-lee/layer-adaptive-sparsity

Citations (146)

Summary

  • The paper introduces Layer-Adaptive Magnitude Pruning (LAMP), a method that optimizes layerwise sparsity using a novel importance score factoring in model-level distortion.
  • Empirical results show LAMP achieves superior sparsity-accuracy tradeoffs compared to existing magnitude pruning methods across various models and datasets.
  • LAMP reduces computational demands and hyperparameter tuning, making it practical for resource-constrained applications like mobile and embedded systems.

Layer-Adaptive Sparsity for Magnitude-Based Pruning: An In-Depth Analysis

The paper presents Layer-Adaptive Magnitude-Based Pruning (LAMP), a method for neural network pruning that focuses on choosing the layerwise sparsity within magnitude-based pruning frameworks. It addresses the question of how to set the sparsity level of each layer, a choice that has traditionally been made through extensive hyperparameter searches or handcrafted heuristic rules.

Overview

The LAMP method introduces a novel importance score: a rescaled weight magnitude that factors in the model-level $\ell_2$ distortion caused by pruning. The score requires neither additional hyperparameters nor substantial computation, in contrast to traditional methods that often depend on handcrafted heuristics or algorithm-specific criteria. The paper makes a compelling case for LAMP, demonstrating its efficacy across popular image classification networks (e.g., VGG-16, ResNet-18/34, DenseNet-121, EfficientNet-B0) and datasets (CIFAR-10/100, SVHN, Restricted ImageNet).
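Concretely, with a layer's weights flattened and indexed in ascending order of magnitude (so that $|W[1]| \le |W[2]| \le \dots$), the LAMP score can be written as

$\text{score}(u; W) = \dfrac{W[u]^2}{\sum_{v \ge u} W[v]^2},$

i.e., the squared magnitude of the $u$-th weight normalized by the total squared magnitude of all weights in the same layer that would still survive once every smaller weight has been removed; global magnitude pruning is then performed on these scores rather than on raw magnitudes. (The indexing convention is restated here for readability; see the paper for the formal definition.)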

Key Contributions

  1. LAMP Score: The paper introduces the LAMP score, which rescales plain weight magnitudes to account for the layer-specific impact of weight removal on the overall model output, approximating the model-level distortion incurred by pruning (see the code sketch following this list).
  2. Superior Performance: Empirical results show that LAMP consistently provides superior sparsity-accuracy tradeoffs compared to existing magnitude-based pruning methods, achieving better performance even when integrated with weight-rewinding setups.
  3. Adaptive Sparsity: LAMP's ability to automatically determine appropriate layerwise sparsity without predefined heuristics or significant recalibration positions it as a versatile technique suited to a range of neural network architectures and operational scenarios.
  4. Practical Implementation: LAMP avoids heavy computational overhead and hyperparameter tuning, two common bottlenecks in existing pruning frameworks, which makes it straightforward to apply in practice.
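As a concrete illustration of how such a score can drive global pruning, the PyTorch sketch below computes per-layer LAMP-style scores and zeroes the globally lowest-scoring weights. It is a minimal sketch based on the description above, not the authors' released implementation (linked in the abstract); the helper names `lamp_scores` and `global_lamp_prune` are introduced here for illustration.

```python
import torch

def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """Illustrative LAMP-style scores for one layer's weight tensor.

    A minimal sketch, not the reference implementation: each squared weight
    magnitude is divided by the sum of squared magnitudes of the weights in
    the same layer that are at least as large.
    """
    flat_sq = weight.detach().flatten() ** 2
    sorted_sq, order = torch.sort(flat_sq)                        # ascending magnitude
    # Suffix sums: for the u-th sorted weight, sum of squared magnitudes from u onward.
    suffix_sums = torch.flip(torch.cumsum(torch.flip(sorted_sq, [0]), dim=0), [0])
    sorted_scores = sorted_sq / suffix_sums
    scores = torch.empty_like(sorted_scores)
    scores[order] = sorted_scores                                 # undo the sort
    return scores.view_as(weight)


def global_lamp_prune(model: torch.nn.Module, sparsity: float) -> None:
    """Zero out the globally lowest-scoring weights across Linear/Conv2d layers."""
    layers = [m for m in model.modules()
              if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))]
    all_scores = torch.cat([lamp_scores(m.weight).flatten() for m in layers])
    k = max(1, int(sparsity * all_scores.numel()))
    threshold = torch.kthvalue(all_scores, k).values              # global score cutoff
    for m in layers:
        mask = (lamp_scores(m.weight) > threshold).float()
        m.weight.data.mul_(mask)                                  # apply pruning mask in place
```

Because every score is normalized within its own layer, comparing scores globally implicitly produces a layerwise sparsity allocation, which is the point of the method.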

Detailed Analysis

The paper undertakes a comprehensive evaluation of LAMP, showcasing its efficacy under different pruning strategies such as one-shot and iterative pruning. LAMP is also compared against various baseline methods across model architectures and datasets, and it maintains its edge particularly on complex models like EfficientNet-B0.
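For context, the two regimes differ only in how the target sparsity is reached: one-shot pruning applies the full sparsity budget in a single pass before fine-tuning, while iterative pruning alternates smaller pruning steps with retraining. A hedged sketch using the hypothetical `global_lamp_prune` helper from above and an assumed `train` routine:

```python
# One-shot: prune to the final sparsity in a single pass, then fine-tune.
global_lamp_prune(model, sparsity=0.95)
train(model, epochs=fine_tune_epochs)          # `train` and the epoch counts are assumed

# Iterative: step the overall sparsity up gradually, retraining after each step.
for target in (0.50, 0.75, 0.90, 0.95):
    global_lamp_prune(model, sparsity=target)
    train(model, epochs=retrain_epochs)
```

The schedule values above are placeholders, not the settings used in the paper's experiments.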

The paper also examines layerwise sparsity patterns, revealing that LAMP tends to conserve a relatively uniform number of non-zero connections throughout the model—a strategy speculated to enhance memory capacity and expressive power under strict sparsity constraints.

Implications & Future Directions

The reduction in computational demands and the robust performance outcomes suggest that LAMP has substantial implications for resource-constrained applications, such as mobile and embedded systems. Its adaptability also points to potential expansions into domains beyond image recognition, including natural language processing and edge computing.

Moving forward, the theoretical implications of LAMP—particularly its ability to approximate output distortion within sparse networks—warrant further exploration. Such investigations could advance the understanding of neural network capacity and the development of more principled, robust pruning methodologies.

In conclusion, LAMP represents a significant progression in the construction of efficient neural networks, highlighting the ongoing need for advanced methods to optimize the balance between model performance and computational resource allocation.
