- The paper introduces a methodology to assess layer expressiveness via matrix rank, guiding optimal expansion layer design.
- It proposes a search over channel configurations, yielding a simple linear parameterization that improves accuracy under fixed computational budgets.
- The new architecture outperforms state-of-the-art models like EfficientNet-B0 on ImageNet and transfer tasks, demonstrating scalability and robustness.
Rethinking Channel Dimensions for Efficient Model Design
The paper "Rethinking Channel Dimensions for Efficient Model Design" critically examines the prevalent stage-wise channel dimension configurations in lightweight models and proposes a novel configuration strategy to enhance performance within computational constraints. Authored by researchers from the NAVER AI Lab, the work challenges conventional approaches rooted in the MobileNetV2 design and further adopted by subsequent network architecture search (NAS) methodologies.
Methodology and Contributions
The authors argue that conventional configurations prioritize computational efficiency, often at the expense of model expressiveness. Their study conducts a systematic investigation of channel dimension configurations, using the rank of output feature matrices as an empirical proxy for expressiveness. Key contributions include:
- Layer Design Study: The expressiveness of individual layers is measured via the matrix rank of their output features (as sketched in the first example after this list). From this analysis, the paper derives guidelines for designing expansion layers, chiefly keeping the channel expansion ratio within a range that avoids rank-deficient outputs.
- Channel Configuration Search: The research then searches over whole-network channel configurations to maximize accuracy under fixed computational budgets. The best-performing configurations parameterize the channel dimension as a linear function of the block index, departing from the conventional stage-wise configuration in which channels stay flat within a stage and roughly double between stages (see the second sketch after this list).
- Introduction of a New Architecture: Building on the discovered linear parameterization, the paper proposes ReXNet, which achieves significant accuracy improvements on ImageNet classification as well as on transfer tasks such as COCO detection and segmentation and fine-grained classification.
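To make the expressiveness measure concrete, the sketch below estimates the rank ratio of a randomly initialized expansion block on random inputs. It is a minimal illustration assuming PyTorch; the function rank_ratio, the Linear + ReLU block, and the sampling sizes are our own expository choices, not the authors' released code or exact protocol.

```python
import torch
import torch.nn as nn

def rank_ratio(layer, d_in, n_samples=1024, n_trials=10):
    """Estimate a layer's expressiveness as rank(layer(X)) / d_out,
    averaged over several random weight initializations."""
    ratios = []
    for _ in range(n_trials):
        # Re-initialize weights so the measure reflects the architecture,
        # not one particular set of trained parameters.
        for m in layer.modules():
            if isinstance(m, (nn.Linear, nn.Conv2d)):
                nn.init.normal_(m.weight, std=0.02)
        x = torch.randn(n_samples, d_in)          # random inputs
        with torch.no_grad():
            y = layer(x)                          # output feature matrix (n_samples, d_out)
        r = torch.linalg.matrix_rank(y).item()    # numerical rank of the outputs
        ratios.append(r / y.shape[1])
    return sum(ratios) / len(ratios)

# A larger expansion ratio tends to give a lower rank ratio (rank collapse),
# which is the effect the layer design guidelines aim to avoid.
d_in = 32
for expand in (2, 4, 6, 8):
    block = nn.Sequential(nn.Linear(d_in, d_in * expand), nn.ReLU())
    print(f"expansion ratio {expand}: rank ratio ~ {rank_ratio(block, d_in):.2f}")
```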
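The contrast between the discovered linear configuration and the conventional stage-wise one can also be sketched in a few lines. The helper names and the particular channel values below are illustrative assumptions (the stage-wise numbers follow MobileNetV2's published widths), not configurations taken from the paper.

```python
def linear_channels(num_blocks, c_start=16, c_end=180, divisor=8):
    """Channel dimension grows linearly with the block index, rounded to a
    hardware-friendly multiple -- the configuration family favored by the search."""
    return [int(round((c_start + (c_end - c_start) * i / (num_blocks - 1)) / divisor) * divisor)
            for i in range(num_blocks)]

def stagewise_channels(stage_channels, blocks_per_stage):
    """Conventional MobileNetV2-style configuration: channels stay flat within a
    stage and roughly double between stages."""
    return [c for c, n in zip(stage_channels, blocks_per_stage) for _ in range(n)]

print(linear_channels(16))                          # smooth, linear increase
print(stagewise_channels([16, 24, 32, 64, 96, 160, 320],
                         [1, 2, 3, 4, 3, 3, 1]))    # flat-then-jump increase
```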
Results
The proposed models deliver strong numerical results, outperforming state-of-the-art models including EfficientNet-B0 on ImageNet classification, both with and without additional training techniques such as AutoAugment or RandAugment. This highlights the effectiveness of the linear parameterization over conventional configurations.
In extensive experiments, ReXNets achieve notable accuracy gains across a range of settings, demonstrating their versatility and robustness. Moreover, when scaled up, the models remain more computationally efficient than EfficientNets, with faster inference on both CPU and GPU.
Implications and Future Directions
The findings argue for revisiting channel dimension strategies in lightweight network design: the linear parameterization yields more effective configurations than the entrenched stage-wise convention.
For future research, integrating the proposed channel configurations into NAS frameworks may yield further performance improvements. Investigating the interplay between channel configurations and other architectural components, such as attention mechanisms or advanced nonlinearities, could offer additional insights for optimizing model expressiveness.
The implications are significant for applications demanding efficient, high-performance models: a straightforward change to how channel dimensions are configured can yield substantial gains.
In conclusion, the work makes a valuable contribution to efficient network design, urging a reevaluation of entrenched design paradigms and laying a foundation for further exploration of channel configurations in future architectures.