FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes (2110.08059v3)

Published 15 Oct 2021 in cs.CV and cs.LG

Abstract: When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size have a limited bandwidth. These approaches scale kernels by dilation, and thus the detail they can describe is limited. In this work, we propose FlexConv, a novel convolutional operation with which high bandwidth convolutional kernels of learnable kernel size can be learned at a fixed parameter cost. FlexNets model long-term dependencies without the use of pooling, achieve state-of-the-art performance on several sequential datasets, outperform recent works with learned kernel sizes, and are competitive with much deeper ResNets on image benchmark datasets. Additionally, FlexNets can be deployed at higher resolutions than those seen during training. To avoid aliasing, we propose a novel kernel parameterization with which the frequency of the kernels can be analytically controlled. Our novel kernel parameterization shows higher descriptive power and faster convergence speed than existing parameterizations. This leads to important improvements in classification accuracy.

Citations (81)

Summary

  • The paper presents FlexConv, a convolutional operation whose continuous kernels are parameterized by a small neural network and whose kernel size is learned during training, simplifying architecture design.
  • FlexConv can be deployed at resolutions higher than those seen during training by adjusting the kernel sampling rate, reaching state-of-the-art results on sequential benchmarks such as sMNIST and competitive accuracy on CIFAR-10.
  • MAGNets are used to construct alias-free kernels by analytically controlling their frequency spectra, improving the efficiency and resolution generalization of the resulting CNN architectures.

FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

The paper introduces FlexConv, a convolutional operation that learns high-bandwidth convolutional kernels of variable size at a fixed parameter cost. This addresses an inefficiency of traditional convolutional neural networks (CNNs), where kernel sizes must be fixed before training: FlexConv makes the kernel size differentiable, so it can be learned during training alongside the other parameters.
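To make the mechanism concrete, here is a minimal, hypothetical 1D sketch in PyTorch. The names (`KernelNet`, `FlexConv1d`), layer widths, and the Gaussian-mask parameterization are illustrative assumptions rather than the authors' implementation; the essential idea shown is that a small MLP maps kernel coordinates to kernel values, while a learnable Gaussian mask determines the effective kernel size at a constant parameter cost.

```python
# Hedged sketch of a FlexConv-style layer (not the authors' code): a small MLP
# maps relative kernel coordinates to kernel values, and a learnable Gaussian
# mask shrinks or grows the effective kernel size during training.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KernelNet(nn.Module):
    """Hypothetical kernel generator: coordinate -> kernel value."""
    def __init__(self, in_ch, out_ch, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, in_ch * out_ch),
        )
        self.in_ch, self.out_ch = in_ch, out_ch

    def forward(self, coords):                      # coords: [K, 1] in [-1, 1]
        k = self.net(coords)                        # [K, in_ch * out_ch]
        return k.t().reshape(self.out_ch, self.in_ch, -1)  # [out, in, K]


class FlexConv1d(nn.Module):
    """Illustrative FlexConv-style layer: fixed parameter cost, learnable size."""
    def __init__(self, in_ch, out_ch, max_kernel_size=33):
        super().__init__()
        self.kernel_net = KernelNet(in_ch, out_ch)
        self.log_sigma = nn.Parameter(torch.zeros(1))   # learnable kernel "size"
        self.register_buffer(
            "coords", torch.linspace(-1.0, 1.0, max_kernel_size).unsqueeze(1)
        )

    def forward(self, x):                            # x: [B, in_ch, L]
        weight = self.kernel_net(self.coords)        # [out, in, K]
        sigma = self.log_sigma.exp()
        mask = torch.exp(-0.5 * (self.coords.squeeze(1) / sigma) ** 2)
        weight = weight * mask                       # mask sets the effective size
        return F.conv1d(x, weight, padding=weight.shape[-1] // 2)


x = torch.randn(2, 3, 128)
layer = FlexConv1d(3, 8)
print(layer(x).shape)                                # torch.Size([2, 8, 128])
```

Because the mask parameter is trained by backpropagation like any other weight, the effective kernel size adapts per layer without changing the number of parameters.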

Core Contributions

  1. Flexible Convolutional Operation: FlexConv is formulated as a flexibly sized continuous kernel convolution that uses a small neural network to parameterize continuous convolutional kernels. This enables modeling continuous kernel functions of varying size with a constant parameter budget.
  2. Deployment at Higher Resolutions: The flexibility of FlexConv allows it to be deployed at higher resolutions than those observed during training by adjusting the sampling rate of the kernel indices.
  3. Avoiding Aliasing with MAGNets: The paper introduces Multiplicative Anisotropic Gabor Networks (MAGNets), an enhanced class of Multiplicative Filter Networks, to construct alias-free convolutional kernels. The frequency spectrum of the generated kernels is controlled analytically, which regularizes FlexConv against aliasing; a sketch of the MAGNet structure follows this list.
  4. Improved Performance Metrics: The proposed FlexNets, CNN architectures utilizing FlexConv, demonstrate state-of-the-art results on sequential datasets and remain competitive with deeper models like ResNets on image benchmarks. FlexNets outperform traditional methods while being more parameter and compute efficient.
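The following hedged PyTorch sketch illustrates contribution 3: a Multiplicative-Filter-Network-style generator whose filters are anisotropic Gabor functions, i.e. a Gaussian envelope multiplied by a sinusoid. The names (`AnisotropicGabor2d`, `MAGNetSketch`), layer sizes, and initialization are assumptions, and the analytic bound on the kernels' frequency content that the paper uses to prevent aliasing is not reproduced here.

```python
# Hedged sketch of a MAGNet-style kernel generator, following the Multiplicative
# Filter Network structure: hidden states are multiplied element-wise by Gabor
# filters of the kernel coordinates. Sizes and initialization are illustrative.
import torch
import torch.nn as nn


class AnisotropicGabor2d(nn.Module):
    """Illustrative Gabor filter bank over 2D coordinates: Gaussian envelope x sinusoid."""
    def __init__(self, hidden):
        super().__init__()
        self.freq = nn.Linear(2, hidden)                        # frequencies and phases
        self.mu = nn.Parameter(torch.rand(hidden, 2) * 2 - 1)   # envelope centers
        self.gamma = nn.Parameter(torch.rand(hidden, 2))        # per-axis scales (anisotropy)

    def forward(self, coords):                                  # coords: [N, 2]
        d = coords.unsqueeze(1) - self.mu                       # [N, hidden, 2]
        envelope = torch.exp(-0.5 * (d.pow(2) * self.gamma.pow(2)).sum(-1))
        return envelope * torch.sin(self.freq(coords))          # [N, hidden]


class MAGNetSketch(nn.Module):
    """MFN-style generator: maps each kernel coordinate to kernel values."""
    def __init__(self, out_channels, hidden=32, layers=3):
        super().__init__()
        self.filters = nn.ModuleList(AnisotropicGabor2d(hidden) for _ in range(layers))
        self.linears = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(layers - 1))
        self.out = nn.Linear(hidden, out_channels)

    def forward(self, coords):                                  # coords: [N, 2] in [-1, 1]
        z = self.filters[0](coords)
        for lin, filt in zip(self.linears, self.filters[1:]):
            z = filt(coords) * lin(z)                           # multiplicative update
        return self.out(z)                                      # [N, out_channels]


# Sample a 33x33 kernel for 3 input channels and 8 output channels.
side = 33
xs = torch.linspace(-1, 1, side)
coords = torch.stack(torch.meshgrid(xs, xs, indexing="ij"), dim=-1).reshape(-1, 2)
weight = MAGNetSketch(out_channels=3 * 8)(coords).t().reshape(8, 3, side, side)
print(weight.shape)                                             # torch.Size([8, 3, 33, 33])
```

In the paper, the Gabor parameters are the lever for anti-aliasing: because each filter's frequency content is explicit, it can be kept below the Nyquist limit of the sampling grid, which is what the code above omits.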

Numerical Results and Comparisons

  • FlexNets achieve state-of-the-art performance on sequential datasets such as sMNIST, pMNIST, sCIFAR10, and noise-padded CIFAR10.
  • On CIFAR-10, FlexNets are competitive with much deeper ResNet architectures while using fewer parameters.
  • Experiments with sequential and time-series data such as CharacterTrajectories and SpeechCommands further showcase the effectiveness of FlexConv in capturing long-term dependencies.

Implications and Future Research

The introduction of FlexConv has several practical and theoretical implications:

  • Efficiency in Architecture Design: By enabling kernel size learning during training, FlexConv reduces the complexity associated with architecture search, specifically kernel size determination.
  • Generalization Across Resolutions: By mitigating aliasing effects, FlexNets provide a foundation for training at lower resolutions and inference at higher resolutions with minimal accuracy loss (see the sketch after this list).
  • Potential for Cross-Task Applicability: The flexibility inherent in FlexConv could signify a step towards versatile models adaptable across diverse tasks without extensive redesign.
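A minimal sketch of what cross-resolution deployment amounts to, under the assumption of a coordinate-based continuous kernel as above: the same learned parameters are re-sampled on a denser coordinate grid when the input resolution increases, so the receptive field stays fixed in "physical" units. The anti-aliasing provided by MAGNets, and the rescaling of kernel values to compensate for the change in sampling rate, are only noted in comments.

```python
# Hedged sketch: sampling one continuous kernel at two resolutions. The kernel
# MLP here is illustrative; in practice the sampled values are also rescaled by
# the sampling-rate ratio so that convolution outputs stay comparable.
import torch
import torch.nn as nn

kernel_net = nn.Sequential(            # illustrative continuous kernel: coord -> value
    nn.Linear(1, 32), nn.GELU(), nn.Linear(32, 1)
)

def sample_kernel(num_points):
    """Sample the same continuous kernel on a grid of the requested density."""
    coords = torch.linspace(-1.0, 1.0, num_points).unsqueeze(1)
    return kernel_net(coords).squeeze(1)

k_train = sample_kernel(33)            # kernel used at training resolution
k_deploy = sample_kernel(65)           # same function, denser grid for 2x resolution
print(k_train.shape, k_deploy.shape)   # torch.Size([33]) torch.Size([65])
```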

Future directions could include further optimizing the computational demands of FlexConv for large-scale applications and exploring the integration of FlexConv with other neural network paradigms such as attention mechanisms. Additionally, leveraging the alias-free generalization capability of FlexNets offers intriguing possibilities for seamless transfer learning across resolutions and datasets.
