FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search (1812.03443v3)

Published 9 Dec 2018 in cs.CV

Abstract: Designing accurate and efficient ConvNets for mobile devices is challenging because the design space is combinatorially large. Due to this, previous neural architecture search (NAS) methods are computationally expensive. ConvNet architecture optimality depends on factors such as input resolution and target devices, but existing approaches are too expensive for case-by-case redesigns. Previous work also focuses primarily on reducing FLOPs, even though FLOP count does not always reflect actual latency. To address these issues, we propose a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, avoiding the need to enumerate and train individual architectures separately as in previous methods. FBNets, a family of models discovered by DNAS, surpass state-of-the-art models, both those designed manually and those generated automatically. FBNet-B achieves 74.1% top-1 accuracy on ImageNet with 295M FLOPs and 23.1 ms latency on a Samsung S8 phone, making it 2.4x smaller and 1.5x faster than MobileNetV2-1.3 at similar accuracy. Despite FBNet-B's higher accuracy and lower latency than MnasNet, we estimate its search cost to be 420x smaller, at only 216 GPU-hours. When searched for different input resolutions and channel sizes, FBNets achieve 1.5% to 6.4% higher accuracy than MobileNetV2. The smallest FBNet achieves 50.2% accuracy with 2.9 ms latency (345 frames per second) on a Samsung S8, and an FBNet optimized for the iPhone X achieves a 1.4x speedup on that device over a Samsung-optimized model.

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

Summary

The paper "FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search" addresses the challenges in designing efficient convolutional neural networks (ConvNets) optimized for mobile devices. The proposed method, Differentiable Neural Architecture Search (DNAS), leverages gradient-based optimization to explore a layer-wise search space for optimal ConvNet architectures. Unlike previous methods that rely on reinforcement learning (RL) and have high computational costs, DNAS is significantly faster and more efficient.
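Concretely, DNAS makes the discrete per-layer choice of block differentiable by relaxing it into a weighted sum over candidate blocks, with weights sampled from a Gumbel-Softmax over learnable architecture parameters. A sketch of this relaxation (the notation here is ours, not necessarily the paper's exact symbols):

$$x_{l+1} = \sum_i m_{l,i}\, b_{l,i}(x_l), \qquad m_{l,i} = \frac{\exp\big((\theta_{l,i} + g_{l,i})/\tau\big)}{\sum_j \exp\big((\theta_{l,j} + g_{l,j})/\tau\big)},$$

where $b_{l,i}$ is the $i$-th candidate block at layer $l$, $\theta_{l,i}$ is its sampling logit, the $g_{l,i}$ are i.i.d. Gumbel(0, 1) noise samples, and $\tau$ is a temperature annealed toward a hard (one-hot) selection as the search progresses.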

Key Contributions

  1. Differentiable Neural Architecture Search (DNAS) Framework: The cornerstone of the paper is the DNAS framework, which allows neural architecture search to be performed with gradient descent via the Gumbel-Softmax relaxation shown above. This framework avoids the prohibitive computational costs associated with RL-based NAS methods (a minimal code sketch of the full setup follows this list).
  2. Hardware-Aware Optimization: The DNAS framework directly incorporates hardware latency into the loss function, ensuring that the discovered architectures are optimized not just for accuracy but also for efficiency on specific target devices.
  3. Layer-Wise Search Space: The search space allows different layers to choose different blocks, improving both the efficiency and accuracy of the ConvNet models. This is in contrast to cell-based search spaces used in prior works, which repeat the same cell structure across all layers.
  4. Latency Estimation via Lookup Table Model: To avoid the cost of benchmarking every one of the vast number of candidate architectures on-device, a lookup table model is used: the latency of each candidate operator is measured once on the target device, and the overall network latency is estimated by summing those per-operator measurements (see the sketch below).
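The following is a minimal PyTorch sketch of how these pieces fit together: a searchable layer mixes candidate blocks with Gumbel-Softmax weights, a per-operator lookup table yields a differentiable expected latency, and the loss couples cross-entropy with that latency, in the spirit of the paper's multiplicative CE · α·log(LAT)^β objective. The class names, candidate ops, latency values, and α/β settings below are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedOp(nn.Module):
    """One searchable layer: a soft mixture of candidate blocks.

    `candidate_ops` and `op_latencies_ms` are illustrative stand-ins for
    FBNet's block choices and its per-operator latency lookup table.
    """

    def __init__(self, candidate_ops, op_latencies_ms):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        # One architecture logit (theta) per candidate block.
        self.theta = nn.Parameter(torch.zeros(len(candidate_ops)))
        # Per-operator latency, measured once on the target device and
        # stored in a lookup table; constant w.r.t. the model weights.
        self.register_buffer(
            "latency_ms", torch.tensor(op_latencies_ms, dtype=torch.float32)
        )

    def forward(self, x, tau):
        # Gumbel-Softmax relaxation: soft, differentiable block selection.
        mask = F.gumbel_softmax(self.theta, tau=tau, hard=False)
        out = sum(m * op(x) for m, op in zip(mask, self.ops))
        # Expected latency of this layer under the same soft mask.
        expected_latency = (mask * self.latency_ms).sum()
        return out, expected_latency


def dnas_loss(logits, target, latency_ms, alpha=0.2, beta=0.6):
    """Latency-aware objective in the spirit of CE * alpha * log(LAT)^beta.

    alpha and beta here are illustrative values, not the paper's settings.
    """
    ce = F.cross_entropy(logits, target)
    return ce * alpha * torch.log(latency_ms) ** beta


# Toy usage: two candidate blocks for one layer (latencies are made up).
ops = [nn.Conv2d(16, 16, 3, padding=1), nn.Conv2d(16, 16, 5, padding=2)]
layer = MixedOp(ops, op_latencies_ms=[1.8, 3.2])
x = torch.randn(2, 16, 32, 32)
out, lat = layer(x, tau=5.0)

# A full supernet would stack many MixedOps, sum their expected latencies,
# and feed the total into dnas_loss alongside the classification logits.
logits = out.mean(dim=(2, 3))          # stand-in classifier head: 16 "classes"
target = torch.randint(0, 16, (2,))
loss = dnas_loss(logits, target, lat)
loss.backward()                        # gradients flow to weights AND theta
```

In the paper's setup, the operator weights and the architecture parameters θ are updated in alternation on separate splits of the training data, and the temperature τ is annealed so the soft mixture converges toward a one-hot block choice; final architectures are then read off from the learned distribution over θ at each layer.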

Numerical Results

The paper presents several FBNets (Facebook-Berkeley-Nets) discovered using the DNAS framework; these models demonstrate strong efficiency and accuracy. Notable results include:

  • FBNet-B achieves a top-1 accuracy of 74.1% on ImageNet with 295M FLOPs and 23.1 ms latency on a Samsung S8, which is 2.4x smaller and 1.5x faster than MobileNetV2-1.3 with similar accuracy.
  • FBNet-C achieves 74.9% top-1 accuracy with 5.5M parameters and 28.1 ms latency on a Samsung S8, outperforming other efficient models such as ShuffleNetV2 and MnasNet.

The paper also highlights the significant reduction in search costs:

  • FBNet-B's search cost is only 216 GPU-hours, which is 420x smaller than that of MnasNet (estimated at 91,000 GPU-hours).

Implications and Future Development

The results of this paper have both practical and theoretical implications. Practically, the efficiency of the DNAS framework enables frequent and context-specific redesigns of ConvNets, which is especially beneficial for mobile and embedded applications where computational resources and power efficiency are paramount. Theoretically, the approach underscores the potential of incorporating hardware-specific constraints directly into the optimization process for network architecture.

Future developments in AI could leverage the principles of DNAS to explore other types of neural networks beyond ConvNets, such as recurrent neural networks (RNNs) or transformers, particularly for applications requiring real-time processing on edge devices. Additionally, advancements could include more complex latency models that account for varying hardware-software co-dependencies or even energy consumption metrics.

By integrating gradient-based methods with hardware-aware constraints, the DNAS framework presents an efficient and scalable solution for neural architecture search, paving the way for more adaptable and efficient neural network designs in the field of AI research.

Authors (10)
  1. Bichen Wu (52 papers)
  2. Xiaoliang Dai (44 papers)
  3. Peizhao Zhang (40 papers)
  4. Yanghan Wang (4 papers)
  5. Fei Sun (151 papers)
  6. Yiming Wu (30 papers)
  7. Yuandong Tian (128 papers)
  8. Peter Vajda (52 papers)
  9. Yangqing Jia (17 papers)
  10. Kurt Keutzer (199 papers)
Citations (1,241)