
Dynamic Group Convolution for Accelerating Convolutional Neural Networks (2007.04242v2)

Published 8 Jul 2020 in cs.CV

Abstract: Replacing normal convolutions with group convolutions can significantly increase the computational efficiency of modern deep convolutional networks, which has been widely adopted in compact network architecture designs. However, existing group convolutions undermine the original network structures by cutting off some connections permanently resulting in significant accuracy degradation. In this paper, we propose dynamic group convolution (DGC) that adaptively selects which part of input channels to be connected within each group for individual samples on the fly. Specifically, we equip each group with a small feature selector to automatically select the most important input channels conditioned on the input images. Multiple groups can adaptively capture abundant and complementary visual/semantic features for each input image. The DGC preserves the original network structure and has similar computational efficiency as the conventional group convolution simultaneously. Extensive experiments on multiple image classification benchmarks including CIFAR-10, CIFAR-100 and ImageNet demonstrate its superiority over the existing group convolution techniques and dynamic execution methods. The code is available at https://github.com/zhuogege1943/dgc.

Citations (38)

Summary

  • The paper introduces Dynamic Group Convolution (DGC) to adaptively select input channels based on image features, enhancing efficiency without sacrificing accuracy.
  • The technique reduces computational load significantly, achieving a 2.04x saving and a 0.37% reduction in Top-1 error on ImageNet with ResNet-18.
  • DGC integrates self-attention within group convolutions, enabling seamless deployment in various CNN architectures for resource-limited and real-time applications.

Dynamic Group Convolution for Accelerating Convolutional Neural Networks

The paper "Dynamic Group Convolution for Accelerating Convolutional Neural Networks" introduces an innovative technique named Dynamic Group Convolution (DGC) to enhance the computational efficiency of convolutional neural networks (CNNs) while mitigating accuracy degradation. By integrating a dynamic execution mechanism with group convolutions, this approach aims to preserve the original network architecture's expressiveness without introducing excessive computation overhead.

The core premise of DGC is to dynamically select input channels to be connected within each group for each sample. Unlike static group convolutions, which permanently sever connections between certain input and output channels, DGC employs a real-time decision process, facilitated by an auxiliary feature selector within each group, to determine the importance of each channel conditioned on the input image. This results in a tailored approach for each image, where only the most relevant channels are activated, thus maintaining computational efficiency.
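To make the mechanism concrete, the sketch below approximates a DGC-style layer in PyTorch: each group owns a small saliency generator (global average pooling followed by two fully connected layers, assumed here) that scores the input channels per sample, and only the top-scoring channels contribute to that group's convolution. The class and parameter names are illustrative, and the hard top-k masking is a simplification of the paper's differentiable selection; this is not the authors' reference implementation (which is available at the linked repository).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DGCSketch(nn.Module):
    """Illustrative sketch of a dynamic-group-convolution-style layer.

    Each group scores all input channels with a small saliency head,
    keeps the top-k channels per sample, and convolves only those
    channels. A simplified approximation, not the reference code.
    """

    def __init__(self, in_channels, out_channels, kernel_size=3,
                 groups=2, keep_ratio=0.5, reduction=4):
        super().__init__()
        assert out_channels % groups == 0
        self.groups = groups
        self.k = max(1, int(in_channels * keep_ratio))
        # One saliency generator (squeeze-and-excitation-like) per group.
        self.selectors = nn.ModuleList([
            nn.Sequential(
                nn.Linear(in_channels, in_channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(in_channels // reduction, in_channels),
            ) for _ in range(groups)
        ])
        # One convolution head per group; group outputs are concatenated.
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels, out_channels // groups, kernel_size,
                      padding=kernel_size // 2, bias=False)
            for _ in range(groups)
        ])

    def forward(self, x):
        b, c, _, _ = x.shape
        # Per-sample channel descriptor via global average pooling.
        descriptor = F.adaptive_avg_pool2d(x, 1).flatten(1)   # (b, c)
        outputs = []
        for selector, conv in zip(self.selectors, self.convs):
            scores = selector(descriptor)                      # (b, c)
            # Keep the k most salient channels per sample. A hard 0/1 mask
            # is used here for clarity; the paper trains the selection
            # end-to-end with a differentiable gating scheme and skips the
            # masked-out channels entirely to realize the FLOP savings.
            topk = scores.topk(self.k, dim=1).indices
            mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
            outputs.append(conv(x * mask.view(b, c, 1, 1)))
        return torch.cat(outputs, dim=1)
```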

The method has been evaluated extensively on several benchmark datasets, including CIFAR-10, CIFAR-100, and ImageNet. Experiments demonstrate that DGC surpasses existing group convolution techniques and dynamic execution methods in both accuracy and efficiency. Notably, when applied to popular models such as ResNet, CondenseNet, and MobileNetV2, DGC consistently reduces computational load while maintaining or improving accuracy. For instance, on ImageNet, embedding DGC into ResNet-18 reduces Top-1 error by 0.37% while delivering a 2.04x computational saving compared to existing filter pruning methods.

The proposed method stands out due to several advantages. First, it preserves the original network structure while adaptively maximizing channel utility for each input image, a significant advance over fixed group convolutions. Second, the framework incorporates a self-attention-style selection mechanism within the DGC layer, reminiscent of multi-head self-attention in Transformer architectures, which helps the groups capture diverse and complementary feature representations across input samples. Moreover, DGC is compatible with a wide range of CNNs, allowing seamless integration and end-to-end optimization without model pre-training.
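As a hypothetical illustration of this drop-in property, a standard 3x3 convolution inside a residual block could be swapped for the sketch above without changing the surrounding tensor shapes (again using the illustrative DGCSketch from the earlier snippet, not the released code):

```python
# Hypothetical usage: swap a standard 3x3 convolution for the DGC-style
# sketch defined above; input and output shapes are unchanged, so the
# surrounding residual block needs no modification.
layer = DGCSketch(in_channels=64, out_channels=64, kernel_size=3,
                  groups=4, keep_ratio=0.25)
x = torch.randn(8, 64, 56, 56)   # a batch of intermediate feature maps
y = layer(x)
print(y.shape)                   # torch.Size([8, 64, 56, 56])
```

Because channel selection is conditioned on the input, such a layer trains end-to-end like any other module, consistent with the paper's claim that no model pre-training is required.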

From a theoretical standpoint, DGC's dynamic selection mechanism, guided by sample-specific importance criteria, might influence future CNN designs, particularly in applications demanding real-time processing on resource-limited platforms. In terms of practical implications, the ability to implement DGC in an array of existing architectures presents opportunities to deploy more accessible and power-efficient AI models in edge computing scenarios, such as mobile devices and IoT networks.

Looking ahead, further optimization of the dynamic index layers and saliency generators within DGC for real-time hardware execution could yield additional performance gains. Research might also extend DGC principles to neural network architectures beyond convolutional models, broadening its applicability across AI domains. The reported results position DGC as a robust step toward balancing computational efficiency and model expressiveness.
