An Analysis of Network to Network Compression via Policy Gradient Reinforcement Learning
The paper "N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning" introduces a method for compressing neural networks to enable their use in real-world applications constrained by hardware limitations. Unlike conventional model compression approaches that require manual architectural modifications or rely on predefined heuristics, this paper leverages reinforcement learning to automate the compression of deep neural networks.
Methodology
The proposed method uses two recurrent policy networks to compress a trained 'teacher' network into a smaller 'student' network. The first, the 'layer removal' policy network, makes coarse, aggressive decisions about which layers of the teacher to drop, yielding an initial student architecture. The second, the 'layer shrinkage' policy network, then fine-tunes the size of each remaining layer. The two policies act sequentially within a Markov Decision Process (MDP) whose states are candidate architectures; each sampled student is trained with knowledge distillation from the teacher, and the policies are updated via policy gradients to maximize a reward that combines the student's accuracy with its compression ratio.
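To make the reward and the policy-gradient update concrete, the sketch below shows one way an accuracy/compression trade-off could be scored and fed into a REINFORCE-style loss. The specific functional form (a saturating C(2 - C) compression term scaled by relative accuracy) and all function and variable names are illustrative assumptions in the spirit of the paper's description, not the authors' reference implementation.

import torch

def compression_reward(student_params: int, teacher_params: int,
                       student_acc: float, teacher_acc: float) -> float:
    """Illustrative reward for a candidate student architecture.

    Rewards high compression (C close to 1) and accuracy close to, or
    above, the teacher's. The C * (2 - C) term saturates so that extreme
    compression with poor accuracy is not favored.
    """
    C = 1.0 - student_params / teacher_params          # compressed fraction in [0, 1)
    return C * (2.0 - C) * (student_acc / teacher_acc)

def reinforce_loss(log_probs: torch.Tensor, reward: float,
                   baseline: float = 0.0) -> torch.Tensor:
    """Policy-gradient (REINFORCE) surrogate loss for one sampled student.

    `log_probs` holds the log-probability of each action taken while
    sampling the architecture (keep/remove a layer, or choose a shrinkage
    factor). Minimizing this loss ascends the expected reward.
    """
    return -(reward - baseline) * log_probs.sum()

In practice the two policies would be applied in turn: the removal policy proposes a coarse student, the shrinkage policy refines the layer sizes, and each sampled student is trained briefly with distillation before its reward is computed and used in the update above.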
Strong Numerical Results
The experimental findings are particularly noteworthy. The authors demonstrate compression rates exceeding 10× for models such as ResNet-34 while maintaining performance comparable to the teacher network. In some instances the student even surpasses the teacher in accuracy, for example a 1.49% accuracy gain on CIFAR-10 alongside a substantial reduction in model size.
Implications and Future Work
This paper's implications are substantial for both theoretical advances and practical deployments in AI. Automating model compression could significantly ease the deployment of deep learning models on edge devices, shrinking their resource footprint without sacrificing performance. Furthermore, the demonstrated generalization of learned policies across different network architectures, as shown by the transfer learning results, highlights the potential of this method in broader contexts such as neural architecture search.
Future work could refine the reward function so that candidate architectures can be evaluated more cheaply, reducing the number of training epochs required while learning the policies. Incorporating constraints beyond model size, such as power consumption and inference time, could also improve alignment with specific deployment scenarios. Moreover, extending the method to hyperparameter optimization via reinforcement learning could open new avenues for research and application.
In conclusion, the authors have provided a robust framework for automating neural network compression using reinforcement learning, presenting an innovative solution to a critical bottleneck in model deployment. This work broadens the possibilities for deploying neural networks in resource-constrained environments and encourages further investigation into network architecture automation.