- The paper proposes a novel lean convolution operator that reduces parameters and computational cost in ResNet architectures.
- It integrates depth-wise and point-wise convolutions with a unique 4-point stencil to streamline network design.
- Experimental results on CIFAR-10, CIFAR-100, and STL-10 show competitive accuracy with significantly lower resource demand.
"LeanResNet: A Low-cost Yet Effective Convolutional Residual Networks" (1904.06952)
Introduction
The paper introduces a novel approach to enhance the efficiency of Convolutional Neural Networks (CNNs) by proposing a lean convolution operator designed to decrease both the parameter count and computational demands without significantly altering existing architectures. This approach is exemplified through its application in ResNets, a widely used and robust architecture known for its reliability in image classification tasks. The research presents a method to balance computational efficiency with performance, offering a promising pathway for deploying CNNs in scenarios with limited computational resources.
Lean Convolution Operators
The proposed LeanResNet architecture leverages convolution operators that combine depth-wise and point-wise (1x1) convolutions in a linear combination. This design enables the two parts to execute simultaneously on hardware, optimizing memory access and computational load. Unlike traditional CNN architectures, which rely on fully coupled convolutions whose parameter counts grow with the product of spatial stencil size and both channel dimensions, the lean convolution operator keeps channel interactions inexpensive. Moreover, its depth-wise component uses a 4-point stencil rather than the conventional 9-point (3x3) stencil, further reducing floating-point operations (FLOPs) and memory accesses.
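The paper does not pin down the operator in this summary, but a minimal numpy sketch can illustrate the idea: a 1x1 point-wise term that mixes channels, plus a depth-wise term that applies a per-channel 4-point stencil. The choice of the four offsets (up, down, left, right) and the weight shapes are assumptions for illustration, not the paper's exact parameterization.

```python
import numpy as np

def lean_conv(x, w_point, w_depth):
    """Sketch of a lean convolution (assumed form):
    point-wise 1x1 channel mixing + depth-wise 4-point stencil.
    x: (C, H, W), w_point: (C, C), w_depth: (C, 4)."""
    C, H, W = x.shape
    # point-wise (1x1) part: mixes channels at each pixel
    out = np.einsum('oc,chw->ohw', w_point, x)
    # depth-wise part: per-channel 4-point stencil, zero padding at borders
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed neighbor set
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    for k, (dy, dx) in enumerate(offsets):
        shifted = pad[:, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        out += w_depth[:, k][:, None, None] * shifted
    return out
```

With the depth-wise weights set to zero the operator reduces to a plain 1x1 convolution, which makes the linear-combination structure easy to check.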
In LeanResNet, a typical ResNet structure is modified by replacing standard convolutions with lean convolutions. Forward propagation retains the ResNet identity mapping, augmented by specialized linear operators that use fewer parameters. As parameterized, the convolution operator restricts each channel's spatial interactions to a small stencil while still allowing cross-channel features to form through the point-wise operations. The result is significantly improved computational performance without drastically changing established network structures.
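The residual structure described above can be sketched as follows, with the standard convolutions swapped for lean ones. The two-operator block form y = x + K2(relu(K1(x))) and the weight shapes are assumptions chosen to mirror a typical pre-activation-free ResNet block, not the paper's exact layer layout (which also includes normalization).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def lean_conv(x, wp, wd):
    # 1x1 channel mixing + depth-wise 4-point stencil (assumed form)
    C, H, W = x.shape
    out = np.einsum('oc,chw->ohw', wp, x)
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    for k, (dy, dx) in enumerate([(-1, 0), (1, 0), (0, -1), (0, 1)]):
        out += wd[:, k][:, None, None] * pad[:, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return out

def lean_residual_block(x, params):
    """Identity mapping with two lean convolutions:
    y = x + K2(relu(K1(x)))."""
    wp1, wd1, wp2, wd2 = params
    return x + lean_conv(relu(lean_conv(x, wp1, wd1)), wp2, wd2)
```

Because the skip connection is untouched, zero-initialized lean weights reproduce the identity, matching the ResNet intuition that each block learns a perturbation of its input.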
Experimental Evaluation
Experiments on the benchmark datasets CIFAR-10, CIFAR-100, and STL-10 show that LeanResNet achieves competitive accuracy with significantly fewer parameters than other compact networks such as MobileNetV2 and ShuffleNetV2. LeanResNet also performs comparably to traditional ResNets while using far fewer resources. These results underscore the effectiveness of the lean convolution operators in maintaining high classification accuracy while reducing model size and computational demands.
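The parameter savings behind these results can be made concrete with a back-of-the-envelope count. Assuming the lean operator costs one 1x1 mixing matrix plus four stencil weights per channel (the sketch form used above, biases ignored), the ratio to a standard 3x3 convolution approaches 9x as the channel count grows:

```python
def conv3x3_params(c_in, c_out):
    # fully coupled 3x3 convolution: 9 weights per (in, out) channel pair
    return 9 * c_in * c_out

def lean_params(c_in, c_out, stencil=4):
    # assumed lean form: 1x1 mixing + per-channel 4-point stencil weights
    return c_in * c_out + stencil * c_in

for c in (64, 256):
    std, lean = conv3x3_params(c, c), lean_params(c, c)
    print(f'{c} channels: {std} vs {lean} params '
          f'({std / lean:.1f}x reduction)')
```

For 64 channels this gives 36864 versus 4352 parameters, roughly an 8.5x reduction under these assumptions; the gap widens as channels increase.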
The paper reports benchmarks of computational cost for various convolution configurations on an NVIDIA GeForce 1080Ti GPU, showing that the lean convolution implementation executes faster than conventional sequential combinations of point-wise and depth-wise convolutions. Notably, as channel dimensions increase, a common scenario in deep networks, the computational advantage of the lean convolutions becomes more pronounced, highlighting their potential for scaling to large networks.
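The paper's numbers come from a fused GPU kernel, which cannot be reproduced in plain Python; the following CPU-only harness is just a sketch of how one might measure how the cost of the two components scales with channel count. The implementations and sizes are illustrative assumptions, and the timings say nothing about the paper's GPU results.

```python
import time
import numpy as np

def pointwise(x, wp):
    # 1x1 channel mixing
    return np.einsum('oc,chw->ohw', wp, x)

def depthwise4(x, wd):
    # per-channel 4-point stencil (assumed lean-conv component)
    C, H, W = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for k, (dy, dx) in enumerate([(-1, 0), (1, 0), (0, -1), (0, 1)]):
        out += wd[:, k][:, None, None] * pad[:, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return out

def bench(fn, reps=5):
    # mean wall-clock time per call
    t0 = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - t0) / reps

rng = np.random.default_rng(0)
for c in (32, 128):
    x = rng.standard_normal((c, 32, 32))
    wp = rng.standard_normal((c, c))
    wd = rng.standard_normal((c, 4))
    t = bench(lambda: pointwise(x, wp) + depthwise4(x, wd))
    print(f'channels={c}: {t * 1e3:.3f} ms per call')
```

On a GPU the win comes from fusing both parts into one kernel, saving the extra pass over memory that running them as separate operations would require.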
Conclusion
LeanResNet effectively addresses the growing demand for computationally efficient CNN architectures capable of deployment on devices with constrained resources. By incorporating novel convolution designs within a familiar ResNet framework, the architecture offers a robust alternative for practitioners seeking to optimize both performance and resource utilization. Additionally, the potential application in multidimensional and time-series data processing opens avenues for further research, particularly in fields requiring real-time data analysis with limited computational capacity. This work suggests a progressive direction for future studies aiming to refine CNN architectures for diverse and resource-sensitive applications.