- The paper introduces CondConv, which computes each convolution kernel as an input-dependent linear combination of learned expert kernels.
- A learned routing function produces the per-example mixing coefficients, yielding higher accuracy for only a small increase in computational cost.
- Experiments on models like MobileNet and EfficientNet demonstrate significant accuracy gains on ImageNet and COCO object detection tasks.
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
The paper "CondConv: Conditionally Parameterized Convolutions for Efficient Inference" introduces a novel approach to convolutional neural networks (CNNs) by challenging the traditional assumption that convolutional kernels are static across all input examples. The authors present conditionally parameterized convolutions (CondConv), which utilize specialized kernels computed as a function of the input example, significantly enhancing model capacity without a proportional increase in computational cost.
Methodology
CondConv introduces a paradigm shift by parameterizing each convolutional kernel as a linear combination of $n$ learned experts, so the layer output is $(\alpha_1 W_1 + \cdots + \alpha_n W_n) * x$, where the coefficients $\alpha_i = r_i(x)$ are computed by a learned routing function (in the paper, a sigmoid applied to a linear projection of globally average-pooled features). Because the coefficients depend only on the input, the experts can be combined into a single kernel before convolving; this is mathematically equivalent to a mixture of $n$ parallel convolutions, $\alpha_1 (W_1 * x) + \cdots + \alpha_n (W_n * x)$, but requires just one convolution per layer. Since combining kernels is far cheaper than the convolution itself on typical feature-map sizes, the network can tailor its operation to each input and grow capacity with $n$ at only a marginal increase in computational cost.
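To make the operation concrete, here is a minimal PyTorch sketch of a CondConv layer. This is not the authors' code; names such as `CondConv2d`, `num_experts`, and `router`, along with the initialization scale, are illustrative. It follows the routing scheme described in the paper (sigmoid of a linear projection of globally average-pooled features) and applies a different combined kernel to each example in a batch via a grouped convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondConv2d(nn.Module):
    """Minimal sketch of a conditionally parameterized convolution.

    Holds `num_experts` kernels and, for each input example, mixes them
    with coefficients produced by a routing function (global average
    pooling followed by a sigmoid-activated linear layer, as described
    in the paper).
    """

    def __init__(self, in_ch, out_ch, kernel_size,
                 num_experts=8, stride=1, padding=0):
        super().__init__()
        self.in_ch, self.out_ch = in_ch, out_ch
        self.kernel_size, self.stride, self.padding = kernel_size, stride, padding
        # Expert kernels: (num_experts, out_ch, in_ch, k, k)
        self.experts = nn.Parameter(
            0.01 * torch.randn(num_experts, out_ch, in_ch,
                               kernel_size, kernel_size))
        # Routing function: pooled features -> one coefficient per expert
        self.router = nn.Linear(in_ch, num_experts)

    def forward(self, x):
        b = x.size(0)
        # alpha = sigmoid(GlobalAveragePool(x) @ R), shape (b, num_experts)
        alphas = torch.sigmoid(self.router(x.mean(dim=(2, 3))))
        # Combine experts into one kernel per example BEFORE convolving:
        # alpha_1 W_1 + ... + alpha_n W_n, shape (b, out_ch, in_ch, k, k)
        kernels = torch.einsum('bn,noihw->boihw', alphas, self.experts)
        # Apply a different kernel to each example via a grouped convolution
        kernels = kernels.reshape(b * self.out_ch, self.in_ch,
                                  self.kernel_size, self.kernel_size)
        x = x.reshape(1, b * self.in_ch, *x.shape[2:])
        out = F.conv2d(x, kernels, stride=self.stride,
                       padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, *out.shape[2:])

# Usage: a 3x3 CondConv layer with 8 experts
layer = CondConv2d(32, 64, kernel_size=3, num_experts=8, padding=1)
y = layer(torch.randn(4, 32, 56, 56))
print(y.shape)  # torch.Size([4, 64, 56, 56])
```

At batch size 1 the grouped convolution reduces to an ordinary convolution with the combined kernel, which is where the combine-then-convolve formulation is cheapest; the sketch omits bias terms and other production details.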
Key Experiments and Results
The authors conducted extensive evaluations on prominent architectures like MobileNetV1, MobileNetV2, MnasNet, ResNet-50, and EfficientNet, applying CondConv to ImageNet classification and COCO object detection tasks. The experimental results highlight CondConv's ability to improve accuracy with marginal increases in multiply-add operations (MADDs).
- MobileNetV1: Adding CondConv raised top-1 ImageNet accuracy from the baseline's 71.9% to 73.7%, with only a small increase in MADDs.
- EfficientNet-B0: With CondConv, the model reached 78.3% top-1 accuracy on ImageNet using only 413 million MADDs, a state-of-the-art accuracy-to-cost trade-off at publication and an improvement over conventional model scaling.
Implications
The CondConv approach is well suited to latency-sensitive applications such as real-time video processing and autonomous vehicle navigation, where inference cost matters as much as accuracy. The authors' routing analysis also suggests that the learned per-example coefficients capture semantically meaningful structure across input examples, pointing to a way of scaling neural networks without linear increases in computational cost.
Theoretical and Practical Contributions
CondConv demonstrates that combining experts before convolution is computationally cheap, a property that matters for real-time deployments and large-scale applications. More broadly, the work argues that input-dependent parameterization lets deep learning models exploit latent patterns in large datasets more effectively than static kernels can.
Future Directions
Exploring more complex kernel-generating functions, advanced architecture search, and application to larger datasets could uncover additional capabilities and limitations of CondConv. Ongoing refinement of the routing mechanisms may further enhance the performance benefits while maintaining efficient inference.
In conclusion, CondConv presents a substantial contribution to the development of deep learning models, pushing beyond conventional methodologies and setting a course for more efficient and capable inference engines within the constraints of contemporary computational resources. As researchers continue to investigate this promising direction, the implications for both theoretical and practical advancements in AI are significant.