Structured Pattern Pruning Using Regularization (2109.08814v1)

Published 18 Sep 2021 in cs.CL and cs.AI

Abstract: Iterative Magnitude Pruning (IMP) is a network pruning method that repeats the process of removing the weights with the least magnitudes and retraining the model. When visualizing the weight matrices of language models pruned by IMP, previous research has shown that a structured pattern emerges, wherein the surviving weights tend to cluster prominently in a select few rows and columns of the matrix. Though prior work has indicated the need for further research into exploiting these structured patterns for potential performance gains, they have yet to be thoroughly studied. We propose SPUR (Structured Pattern pruning Using Regularization), a novel pruning mechanism that preemptively induces structured patterns during compression by adding a regularization term to the objective function of IMP. Our results show that SPUR significantly preserves model performance under high-sparsity settings regardless of the language or the task. Our contributions are as follows: (i) we propose SPUR, a network pruning mechanism that improves upon IMP regardless of the language or the task; (ii) we are the first to empirically verify the efficacy of the "structured patterns" observed in previous pruning research; (iii) SPUR is resource-efficient in that it does not require significant additional computation.
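The abstract describes the mechanism only at a high level: an IMP loop whose retraining objective carries an extra structure-inducing regularization term. The PyTorch sketch below is one way such a mechanism could look; the group-lasso-style row/column penalty, the function names (structured_penalty, imp_step), and the hyperparameters lam and sparsity are illustrative assumptions, not the paper's actual formulation.

```python
# Sketch of one IMP round with a structure-inducing regularizer.
# ASSUMPTION: the abstract only says SPUR adds "a regularization term"
# that induces row/column-clustered patterns; a group-lasso penalty over
# rows and columns is one standard way to encourage that, and may differ
# from the paper's exact term.
import torch
import torch.nn as nn

def structured_penalty(weight: torch.Tensor) -> torch.Tensor:
    # Sum of L2 norms of each row and each column: this pushes entire
    # rows/columns toward zero, so surviving weights concentrate in a
    # select few rows and columns of the matrix.
    row_norms = weight.norm(p=2, dim=1).sum()
    col_norms = weight.norm(p=2, dim=0).sum()
    return row_norms + col_norms

def imp_step(model, loader, optimizer, loss_fn, lam=1e-4, sparsity=0.2):
    # One IMP round: retrain with the regularized objective, then remove
    # the lowest-magnitude fraction of weights in each linear layer.
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        reg = sum(structured_penalty(m.weight)
                  for m in model.modules() if isinstance(m, nn.Linear))
        (loss + lam * reg).backward()
        optimizer.step()
    with torch.no_grad():
        for m in model.modules():
            if isinstance(m, nn.Linear):
                w = m.weight
                k = int(sparsity * w.numel())
                if k > 0:
                    thresh = w.abs().flatten().kthvalue(k).values
                    w[w.abs() <= thresh] = 0.0  # magnitude pruning
```

A complete IMP implementation would additionally keep a per-layer pruning mask so that weights zeroed in earlier rounds stay zero during subsequent retraining; that bookkeeping is omitted here for brevity.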

Authors (2)
  1. Dongjun Park (1 paper)
  2. Geung-Hee Lee (1 paper)
