S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks (2110.08764v1)

Published 17 Oct 2021 in cs.LG and cs.NE

Abstract: We explore a new perspective on adapting the learning rate (LR) schedule to improve the performance of ReLU-based networks as they are iteratively pruned. Our work and contribution consist of four parts: (i) We find that, as a ReLU-based network is iteratively pruned, the distribution of weight gradients tends to become narrower. This leads to the finding that as the network becomes more sparse, a larger LR should be used to train the pruned network. (ii) Motivated by this finding, we propose a novel LR schedule, called S-Cyclical (S-Cyc), which adapts the conventional cyclical LR schedule by gradually increasing the LR upper bound (max_lr) in an S-shape as the network is iteratively pruned. We highlight that S-Cyc is a method-agnostic LR schedule that applies to many iterative pruning methods. (iii) We evaluate the performance of the proposed S-Cyc and compare it to four LR schedule benchmarks. Our experimental results on three state-of-the-art networks (VGG-19, ResNet-20, ResNet-50) and two popular datasets (CIFAR-10, ImageNet-200) demonstrate that S-Cyc consistently outperforms the best-performing benchmark with an improvement of 2.1% - 3.4%, without a substantial increase in complexity. (iv) We evaluate S-Cyc against an oracle and show that S-Cyc achieves comparable performance to the oracle, which carefully tunes max_lr via grid search.
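
The abstract describes the mechanism (an S-shaped increase of max_lr across pruning rounds, layered on a cyclical schedule) but not the exact formula. Below is a minimal sketch of the idea, assuming a logistic curve for the S-shaped growth and Smith's triangular cyclical LR within each training run; the function names and all hyperparameter values (lr_low, lr_high, steepness, step_size) are illustrative placeholders, not the paper's settings.

```python
import math

def s_cyc_max_lr(prune_round, total_rounds, lr_low=0.1, lr_high=0.5, steepness=8.0):
    """S-shaped growth of the cyclical upper bound max_lr across pruning rounds.
    A logistic curve stands in for the paper's S-shape; lr_low, lr_high, and
    steepness are placeholder hyperparameters, not values from the paper."""
    x = prune_round / max(total_rounds - 1, 1)           # pruning progress in [0, 1]
    s = 1.0 / (1.0 + math.exp(-steepness * (x - 0.5)))   # logistic S-curve
    return lr_low + (lr_high - lr_low) * s

def triangular_clr(step, step_size, base_lr, max_lr):
    """Triangular cyclical LR (Smith, 2017): oscillates between base_lr and max_lr."""
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# Sketch of an iterative-pruning loop: each round trains the (sparser) network
# with a cyclical schedule whose upper bound grows in an S-shape.
total_rounds, steps_per_round = 20, 5000
for r in range(total_rounds):
    max_lr = s_cyc_max_lr(r, total_rounds)
    for step in range(steps_per_round):
        lr = triangular_clr(step, step_size=500, base_lr=0.01, max_lr=max_lr)
        # optimizer.param_groups[0]["lr"] = lr   # hook into your training loop here
    # prune_network(model)                       # hypothetical pruning step
```

The key design point this sketch illustrates: the within-run cyclical shape is unchanged across rounds; only its upper bound max_lr grows as sparsity increases, matching the paper's observation that sparser networks benefit from larger LRs.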

Authors (3)
  1. Shiyu Liu (32 papers)
  2. Chong Min John Tan (1 paper)
  3. Mehul Motani (54 papers)
Citations (3)