
RepVGG: Making VGG-style ConvNets Great Again (2101.03697v3)

Published 11 Jan 2021 in cs.CV, cs.AI, and cs.LG

Abstract: We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology. Such decoupling of the training-time and inference-time architecture is realized by a structural re-parameterization technique so that the model is named RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is the first time for a plain model, to the best of our knowledge. On NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101 with higher accuracy and show favorable accuracy-speed trade-off compared to the state-of-the-art models like EfficientNet and RegNet. The code and trained models are available at https://github.com/megvii-model/RepVGG.

Citations (1,341)

Summary

  • The paper presents a structural re-parameterization approach that merges multi-branch training into a single, efficient inference pathway.
  • The paper demonstrates that RepVGG variants, such as RepVGG-A0 and A1, achieve up to 74.46% top-1 accuracy while running significantly faster than comparable ResNets.
  • The paper highlights the potential for extending re-parameterization techniques to design more efficient ConvNets and optimize hardware-specific deployment.

RepVGG: Making VGG-style ConvNets Great Again

The paper "RepVGG: Making VGG-style ConvNets Great Again" by Xiaohan Ding et al. presents a new convolutional neural network (ConvNet) architecture designed to optimize the trade-off between performance and computational efficiency. Unlike many recent, complex neural network architectures which prioritize high performance at the cost of increased complexity and reduced speed, RepVGG takes inspiration from the simplicity of the classic VGG network. This paper introduces a structural re-parameterization approach that enables a straightforward feed-forward model during inference while leveraging a multi-branch topology during training.

Structural Re-parameterization

Key to RepVGG's success is the decoupling of the training-time and inference-time architectures. During training, the network benefits from a multi-branch topology comprising 3×3 convolutions alongside identity and 1×1 branches. These branches facilitate efficient training by helping to mitigate the gradient vanishing problem. At inference time, these multiple branches are re-parameterized into a single, efficient 3×3 convolutional layer. This transformation is achieved via straightforward algebraic manipulation of the network parameters.
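
To make the fusion concrete, the sketch below works through the algebra under stated assumptions: each branch's BatchNorm is folded into an equivalent kernel and bias, the 1×1 kernel is zero-padded to 3×3, the identity branch (present only when input and output channels match and the stride is 1) is expressed as a 3×3 kernel, and the three kernels and biases are summed. Function names and the (gamma, beta, mean, var) tuple convention are illustrative assumptions, not the authors' API.

```python
# Minimal NumPy sketch of RepVGG-style branch fusion (illustrative, not the
# official implementation). Kernels are (C_out, C_in, kH, kW); BN parameters
# are passed as (gamma, beta, running_mean, running_var) tuples.
import numpy as np

def fuse_conv_bn(kernel, gamma, beta, mean, var, eps=1e-5):
    """Fold a BatchNorm that follows a conv into the conv's kernel and bias."""
    std = np.sqrt(var + eps)
    scale = gamma / std                              # per-output-channel scale
    fused_kernel = kernel * scale[:, None, None, None]
    fused_bias = beta - mean * scale
    return fused_kernel, fused_bias

def pad_1x1_to_3x3(kernel_1x1):
    """Zero-pad a (C_out, C_in, 1, 1) kernel so its weight sits at the 3x3 center."""
    return np.pad(kernel_1x1, ((0, 0), (0, 0), (1, 1), (1, 1)))

def identity_as_3x3(channels):
    """Write the identity mapping as a 3x3 kernel: 1 at the center of channel c -> c."""
    kernel = np.zeros((channels, channels, 3, 3), dtype=np.float32)
    for c in range(channels):
        kernel[c, c, 1, 1] = 1.0
    return kernel

def reparameterize(k3, bn3, k1, bn1, bn_id, channels):
    """Merge the 3x3, 1x1, and identity branches into one equivalent 3x3 conv."""
    k3f, b3f = fuse_conv_bn(k3, *bn3)
    k1f, b1f = fuse_conv_bn(k1, *bn1)
    kidf, bidf = fuse_conv_bn(identity_as_3x3(channels), *bn_id)
    kernel = k3f + pad_1x1_to_3x3(k1f) + kidf
    bias = b3f + b1f + bidf
    return kernel, bias
```

Because every step is an exact linear identity, the single fused 3×3 convolution reproduces the three-branch block's outputs up to floating-point error, which is what permits the plain VGG-like inference-time body.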

Performance and Efficiency

RepVGG demonstrates its unique balance between speed and accuracy through extensive evaluations on the ImageNet dataset. Notably, several configurations of RepVGG (RepVGG-A0, A1, A2, etc.) outperform corresponding versions of ResNet in terms of both top-1 accuracy and speed, measured in examples per second on an NVIDIA 1080Ti GPU. RepVGG-A0, for instance, achieves a top-1 accuracy of 72.41% while running 33% faster than ResNet-18, and RepVGG-A1 achieves 74.46% accuracy while being 64% faster than ResNet-34.

The paper also discusses the efficacy of structural re-parameterization over other methods like DiracNet and post-addition batch normalization. It finds that the combination of multiple branches during training, including identity and 1×1 branches each with its own batch normalization applied before the branch addition, is critical to achieving high performance.

Implications

The implications of this research are substantial for both practical usage and theoretical advancements in artificial intelligence. Practically, RepVGG offers a highly efficient model for deployment in GPU environments and specialized hardware, thus suiting applications requiring high inference speeds. Theoretically, this research underscores the potential of decoupling training-time and inference-time architectures to achieve better trade-offs between complexity, speed, and performance.

Future Developments

Future research could potentially explore several avenues opened up by RepVGG. Firstly, the exploration of other types of architectural re-parameterization could yield even more efficient ConvNets. Enhancing the structural re-parameterization techniques and extending them to other types of layers or architectures may bring about additional improvements. Secondly, investigating the impact of structural re-parameterization in neural architecture search (NAS) frameworks could automate the discovery of even more optimal configurations. Lastly, as specialized hardware for AI inference continues to evolve, RepVGG provides a compelling case for designing hardware that capitalizes on its simple, highly parallelizable structure.

In summary, RepVGG reimagines the classic VGG-style architecture through structural re-parameterization, achieving a finely tuned balance between computational efficiency and model performance. This work presents a notable advancement, particularly for deployment scenarios where inference speed is of paramount importance.
