Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets (2010.14819v2)

Published 28 Oct 2020 in cs.CV

Abstract: To obtain excellent deep neural architectures, a series of techniques are carefully designed in EfficientNets. The giant formula for simultaneously enlarging the resolution, depth and width provides a Rubik's cube for neural networks, so that we can find networks with high efficiency and excellent performance by twisting the three dimensions. This paper aims to explore the twisting rules for obtaining deep neural networks with minimum model sizes and computational costs. Different from network enlarging, we observe that resolution and depth are more important than width for tiny networks. Therefore, the original method, i.e., the compound scaling in EfficientNet, is no longer suitable. To this end, we summarize a tiny formula for downsizing neural architectures through a series of smaller models derived from EfficientNet-B0 under a FLOPs constraint. Experimental results on the ImageNet benchmark show that our TinyNet performs much better than the smaller versions of EfficientNets obtained by inverting the giant formula. For instance, our TinyNet-E achieves a 59.9% Top-1 accuracy with only 24M FLOPs, which is about 1.9% higher than that of the previous best MobileNetV3 with similar computational cost. Code will be available at https://github.com/huawei-noah/ghostnet/tree/master/tinynet_pytorch and https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/tinynet.

Authors (6)
  1. Kai Han (184 papers)
  2. Yunhe Wang (145 papers)
  3. Qiulin Zhang (3 papers)
  4. Wei Zhang (1492 papers)
  5. Chunjing Xu (66 papers)
  6. Tong Zhang (570 papers)
Citations (78)

Summary

  • The paper introduces a novel scaling method for TinyNets, emphasizing resolution and depth over width under fixed computational constraints.
  • It employs Gaussian process regression to optimize resolution and depth, ensuring the efficient design of compact neural architectures.
  • The approach outperforms traditional downsized networks like EfficientNet, MobileNet, and ShuffleNet on both ImageNet-100 and ImageNet-1000 datasets.

Analysis of "Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets"

This paper introduces a methodology for constructing efficient deep neural networks, specifically targeting reduced model sizes without compromising performance. The focus lies on optimizing resolution, depth, and width (effectively treating these factors as a "Rubik's Cube") to derive TinyNets from baseline models such as EfficientNet-B0.

Methodology Overview

The authors challenge the existing compound scaling method used in EfficientNets for enlarging models (which adjusts resolution, depth, and width uniformly) and argue that this is unsuitable for smaller, "tiny" networks. The paper emphasizes that for reduced models, resolution and depth impact performance more significantly than width. This observation forms the foundation for their proposed "tiny formula" for downscaling neural architectures.
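
The analysis throughout fixes a computational budget measured in FLOPs. For a standard convolutional network, FLOPs grow roughly linearly with the depth multiplier and quadratically with the width and resolution multipliers, which is the bookkeeping behind that constraint. A minimal sketch of the relation (the function name and example values are illustrative, not taken from the paper):

```python
def relative_flops(d: float, w: float, r: float) -> float:
    """Approximate FLOPs of a scaled model relative to the baseline.

    d, w, r are the depth, width, and resolution multipliers applied to a
    base network such as EfficientNet-B0. FLOPs scale roughly linearly with
    depth and quadratically with width and input resolution.
    """
    return d * (w ** 2) * (r ** 2)


# Halving the width while keeping depth and resolution fixed
# cuts FLOPs to roughly 25% of the baseline.
print(relative_flops(d=1.0, w=0.5, r=1.0))  # 0.25
```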

  1. Rethinking the Importance of Model Dimensions:
    • The paper evaluates the influence of resolution, depth, and width under fixed computational constraints (measured by FLOPs).
    • Empirical results from randomly sampled models show that resolution, followed by depth, matters more than width for small models.
  2. Tiny Formula Development:
    • The research suggests first optimizing resolution and depth by fitting a Gaussian process regression over a pool of models with varying FLOPs (see the sketch after this list).
    • Once the optimal resolution and depth are determined, the width is adjusted to meet the computational budget.
  3. Implementation and Evaluation:
    • The paper evaluates its approach using the ImageNet-100 and ImageNet-1000 datasets, comparing results against standard practices and other small CNN architectures.
    • The derived TinyNets consistently outperform reduced versions of EfficientNets and other compact models such as MobileNet and ShuffleNet.
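
A rough sketch of the two-step procedure referenced above, using scikit-learn's Gaussian process regressor. The observation values, kernel choice, and the final width-solving step are illustrative assumptions for exposition; the paper derives its fits from its own pool of models built from EfficientNet-B0.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical observations: for several trained small models we record the
# FLOPs ratio c (relative to EfficientNet-B0) and the resolution/depth
# multipliers (r, d) of the best-performing model at that budget.
c_obs = np.array([[0.5], [0.25], [0.1], [0.05]])   # FLOPs ratios
r_obs = np.array([0.85, 0.75, 0.60, 0.50])         # resolution multipliers
d_obs = np.array([0.90, 0.80, 0.65, 0.55])         # depth multipliers

# Fit one Gaussian process per dimension to interpolate r(c) and d(c).
gp_r = GaussianProcessRegressor(kernel=RBF()).fit(c_obs, r_obs)
gp_d = GaussianProcessRegressor(kernel=RBF()).fit(c_obs, d_obs)

def tiny_config(c: float) -> tuple[float, float, float]:
    """Predict (r, d) for a target FLOPs ratio c, then solve for the width
    multiplier w from the constraint d * w^2 * r^2 ~= c."""
    r = float(gp_r.predict([[c]])[0])
    d = float(gp_d.predict([[c]])[0])
    w = (c / (d * r ** 2)) ** 0.5
    return r, d, w

# Multipliers for a model at roughly 20% of EfficientNet-B0's FLOPs.
print(tiny_config(0.2))
```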

Key Results

The findings reveal that models configured using the proposed method often exceed the performance of traditionally scaled-down networks:

  • TinyNet-B on ImageNet-100 exhibits higher accuracy than models derived from the standard EfficientNet method across equivalent FLOP levels.
  • On ImageNet-1000, TinyNet-E demonstrates superior performance compared to MobileNetV3 Small, despite utilizing comparable computational resources.

The results underscore the effectiveness of the approach in not only maintaining but sometimes enhancing performance while reducing model size considerably.
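
For readers who want to try the released models, pretrained TinyNet variants are distributed through the linked GitHub repository and have also been packaged in the timm library; the snippet below assumes the timm model names (tinynet_a through tinynet_e), which should be checked against the installed version.

```python
import timm
import torch

# Load the smallest variant (TinyNet-E, ~24M FLOPs in the paper).
# The model name assumes timm's packaging of the TinyNet family.
model = timm.create_model("tinynet_e", pretrained=True)
model.eval()

# TinyNet-E uses a reduced input resolution; timm exposes it in the
# model's default configuration.
cfg = model.default_cfg
x = torch.randn(1, *cfg["input_size"])
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([1, 1000])
```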

Implications and Future Work

The implications of this research are significant for deploying deep learning models in resource-constrained environments, such as mobile and embedded systems. Optimizing architectures to achieve better trade-offs between computational efficiency and performance could drive advancements in various applications, including real-time image processing and edge computing.

The paper also opens avenues for future exploration in AI research. One potential direction is the adaptation of this methodology for other network architectures, expanding its general applicability. Furthermore, advances in automated model scaling could benefit from integrating the concepts introduced in this paper, enhancing the efficiency and adaptability of neural architectures.

In conclusion, this paper effectively contributes to the ongoing discourse in neural network optimization, providing a robust framework for designing compact yet powerful models. The structured evaluation and empirical evidence presented lay a solid groundwork for further exploration and potential adoption in practical deep learning implementations.