
NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm (1711.02017v3)

Published 6 Nov 2017 in cs.NE, cs.AI, and cs.CV

Abstract: Deep neural networks (DNNs) have begun to have a pervasive impact on various applications of machine learning. However, the problem of finding an optimal DNN architecture for large applications is challenging. Common approaches go for deeper and larger DNN architectures but may incur substantial redundancy. To address these problems, we introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training. We propose a DNN synthesis tool (NeST) that combines both methods to automate the generation of compact and accurate DNNs. NeST starts with a randomly initialized sparse network called the seed architecture. It iteratively tunes the architecture with gradient-based growth and magnitude-based pruning of neurons and connections. Our experimental results show that NeST yields accurate, yet very compact DNNs, with a wide range of seed architecture selection. For the LeNet-300-100 (LeNet-5) architecture, we reduce network parameters by 70.2x (74.3x) and floating-point operations (FLOPs) by 79.4x (43.7x). For the AlexNet and VGG-16 architectures, we reduce network parameters (FLOPs) by 15.7x (4.6x) and 30.2x (8.6x), respectively. NeST's grow-and-prune paradigm delivers significant additional parameter and FLOPs reduction relative to pruning-only methods.

Citations (226)

Summary

  • The paper introduces a grow-and-prune paradigm that starts with a sparse seed and dynamically enhances the network using gradient-based growth and magnitude-based pruning.
  • The paper achieves significant reductions in parameters and FLOPs; for example, LeNet-5 sees 74.3× fewer parameters and 43.7× fewer FLOPs without sacrificing accuracy.
  • The paper demonstrates that NeST overcomes reliance on human-designed architectures, offering an automated synthesis approach ideal for resource-constrained applications.

NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm

The paper presents NeST, a neural network synthesis tool that optimizes deep neural network (DNN) architectures through a novel grow-and-prune paradigm. The approach addresses the challenge of deriving optimal DNN architectures for large-scale applications. Conventional methods tend to build increasingly deeper and larger networks, often introducing substantial redundancy. NeST alleviates this issue by combining network growth and pruning into a single workflow that automates the generation of compact, high-performing DNNs.

The methodology behind NeST is inspired by the dynamic nature of synaptic connections in the human brain, where the number of connections first increases and then decreases over time. NeST begins with a sparse initial architecture, referred to as the "seed architecture," and iteratively enhances it. The growth phase involves adding neurons and connections based on gradient information to improve accuracy. This process is followed by a pruning phase that removes redundant elements identified by their magnitude, allowing the network to maintain compactness without compromising accuracy.
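A minimal NumPy sketch of these two phases on a single fully connected layer may help make the mechanics concrete. The thresholds, initialization, and toy data below are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np

# Illustrative sketch of one grow-and-prune iteration on a linear layer y = x @ W.
# Percentile thresholds and the toy data are assumptions chosen for readability.

rng = np.random.default_rng(0)

n_in, n_out = 8, 4
W = rng.normal(scale=0.1, size=(n_in, n_out))
mask = rng.random((n_in, n_out)) < 0.2            # sparse "seed architecture"
W *= mask

def layer_gradient(x, upstream_grad):
    """Gradient of the loss w.r.t. W for y = x @ W (independent of W itself)."""
    return np.outer(x, upstream_grad)

# --- Growth phase: activate dormant connections with large loss gradients ---
x = rng.normal(size=n_in)                          # toy input activation
upstream_grad = rng.normal(size=n_out)             # toy gradient flowing into the layer
grad = layer_gradient(x, upstream_grad)

grow_threshold = np.percentile(np.abs(grad[~mask]), 90)   # grow top ~10% of dormant links
newly_grown = (~mask) & (np.abs(grad) >= grow_threshold)
mask |= newly_grown
W[newly_grown] = -1e-2 * np.sign(grad[newly_grown])       # small init against the gradient

# --- Pruning phase: remove active connections with the smallest magnitudes ---
prune_threshold = np.percentile(np.abs(W[mask]), 30)      # prune the weakest ~30%
pruned = mask & (np.abs(W) < prune_threshold)
mask &= ~pruned
W *= mask

print(f"grew {newly_grown.sum()} connections, pruned {pruned.sum()}, "
      f"{mask.sum()}/{mask.size} connections remain")
```

In the paper, growth and pruning are applied iteratively and at the level of both connections and neurons; the sketch above only illustrates the connection-level criteria (gradient magnitude for growth, weight magnitude for pruning).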

Experimental results demonstrate NeST's efficacy in synthesizing the LeNet, AlexNet, and VGG-16 architectures. Specifically, NeST achieved a significant reduction in the number of parameters and floating-point operations (FLOPs) with no accuracy sacrifice: LeNet-300-100 and LeNet-5 saw parameter reductions of 70.2× and 74.3×, respectively, alongside FLOPs reductions of 79.4× and 43.7×. Similarly, for AlexNet and VGG-16, NeST reduced parameters (FLOPs) by 15.7× (4.6×) and 30.2× (8.6×), respectively, all while maintaining performance on par with the baseline models.
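For a sense of scale, a back-of-the-envelope check (assuming the standard 784-300-100-10 fully connected LeNet-300-100 topology, which the summary does not restate) suggests the compressed model retains only a few thousand parameters:

```python
# Rough sanity check of the reported compression ratio, assuming the standard
# 784-300-100-10 fully connected LeNet-300-100 topology (weights + biases).
layers = [(784, 300), (300, 100), (100, 10)]
total_params = sum(n_in * n_out + n_out for n_in, n_out in layers)   # 266,610
remaining = total_params / 70.2                                      # ~3,800 parameters
print(f"baseline: {total_params:,} params -> ~{remaining:,.0f} after a 70.2x reduction")
```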

Critically, NeST diverges from traditional DNN design paradigms by starting from a potentially arbitrary and sparse seed architecture, overcoming the reliance on human intuition for selecting baseline architectures. The grow-and-prune strategy fosters significant reductions in redundant components, leading to lightweight DNN models.

Theoretically, this work deepens the understanding of neural architecture search (NAS) and the architecture design space. Practically, NeST promises substantial efficiency gains for deploying DNNs, particularly in resource-constrained environments such as mobile or edge devices, where compute and memory are limited.

Future research may extend NeST to more complex architectures such as ResNet or DenseNet, and reduce the time and memory overheads of training large models under the grow-and-prune paradigm without relying on memory-intensive masks.

In conclusion, NeST represents a notable advancement in automated DNN synthesis, marrying the biological inspiration of synaptic pruning with computational efficiency. Its contributions to achieving compact and efficient DNNs open avenues for deployment in a broader range of applications without performance compromises.