
Slimmable Neural Networks (1812.08928v1)

Published 21 Dec 2018 in cs.CV and cs.AI

Abstract: We present a simple and general method to train a single neural network executable at different widths (number of channels in a layer), permitting instant and adaptive accuracy-efficiency trade-offs at runtime. Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization. At runtime, the network can adjust its width on the fly according to on-device benchmarks and resource constraints, rather than downloading and offloading different models. Our trained networks, named slimmable neural networks, achieve similar (and in many cases better) ImageNet classification accuracy than individually trained models of MobileNet v1, MobileNet v2, ShuffleNet and ResNet-50 at different widths respectively. We also demonstrate better performance of slimmable models compared with individual ones across a wide range of applications including COCO bounding-box object detection, instance segmentation and person keypoint detection without tuning hyper-parameters. Lastly we visualize and discuss the learned features of slimmable networks. Code and models are available at: https://github.com/JiahuiYu/slimmable_networks

Citations (525)

Summary

  • The paper presents a novel method allowing a single network to adjust its width dynamically for runtime efficiency and accuracy trade-offs.
  • It employs Switchable Batch Normalization to stabilize feature normalization across configurations, ensuring competitive ImageNet performance.
  • Extensive experiments show slimmable networks outperform individually trained models in tasks like object detection and segmentation, reducing resource overhead.

Slimmable Neural Networks: A Comprehensive Overview

The paper "Slimmable Neural Networks" presents a novel method for training a single neural network that can dynamically adjust to different widths, enabling instant and adaptive accuracy-efficiency trade-offs at runtime. The proposed system introduces the concept of slimmable neural networks by leveraging switchable batch normalization, which permits a network to alter its width according to various device constraints without the need to deploy multiple models.

Highlights and Methodology

The central innovation of this research is training a single network with switchable width configurations, using an independent set of batch normalization parameters for each configuration. This approach, referred to as Switchable Batch Normalization (S-BN), addresses the discrepancies in feature means and variances caused by the varying number of active channels at each width. Keeping the S-BN parameters independent for each switch ensures robust feature normalization and overall stability of the training process.
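A minimal sketch of the S-BN idea, assuming a simple `width_idx` attribute that selects the active switch (the names below are illustrative, not the released code's API):

```python
import torch.nn as nn

class SwitchableBatchNorm2d(nn.Module):
    """One independent BatchNorm2d per width switch.

    Each switch keeps its own running mean/variance and affine parameters,
    so features at every width are normalized with statistics collected at
    that width.
    """
    def __init__(self, channels_per_switch):
        super().__init__()
        # e.g. channels_per_switch = [16, 32, 48, 64] for 0.25x..1.0x switches
        self.bns = nn.ModuleList(nn.BatchNorm2d(c) for c in channels_per_switch)
        self.width_idx = len(channels_per_switch) - 1  # start at full width

    def forward(self, x):
        # Route the input through the BN belonging to the active switch;
        # x must already have the matching number of channels.
        return self.bns[self.width_idx](x)
```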

The slimmable networks demonstrate competitive performance across architectures such as MobileNet v1, MobileNet v2, ShuffleNet, and ResNet-50, achieving comparable or superior accuracy in ImageNet classification relative to individually trained models at each width. Notably, slimmable models also transfer to a range of downstream tasks, from COCO object detection to instance segmentation and person keypoint detection, without further hyper-parameter tuning during deployment.
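As a simple illustration of the runtime adaptation this enables, a deployed model could pick the widest switch whose latency fits a budget measured once on the target device; the function and numbers below are hypothetical, not taken from the paper or its code.

```python
def select_width(latency_budget_ms, measured_latency):
    """Return the widest switch whose measured latency fits the budget."""
    feasible = [w for w, ms in measured_latency.items() if ms <= latency_budget_ms]
    return max(feasible) if feasible else min(measured_latency)

# Example: latencies for four switches, benchmarked once on the target device.
measured_latency = {0.25: 6.0, 0.5: 11.0, 0.75: 19.0, 1.0: 32.0}
active_width = select_width(20.0, measured_latency)  # -> 0.75
```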

Experimental Results

The paper presents extensive experimental evidence to support the effectiveness of slimmable neural networks. Key results include:

  • ImageNet Classification: Slimmable networks are shown to achieve comparable or improved top-1 error rates compared to individually trained models. For instance, a slimmable MobileNet v1 with a 0.25x configuration demonstrates an accuracy improvement of 3.3% over its individually trained counterpart.
  • Scalability: A model trained with eight switches shows a negligible accuracy drop compared to models with fewer switches, demonstrating that the approach scales to many width configurations (a training-loop sketch follows this list).
  • Application Versatility: In object detection tasks on the COCO dataset, slimmable networks consistently outperform individually trained models, indicating their effectiveness in real-world deployment scenarios.
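The joint training scheme behind these results can be sketched as follows: for each mini-batch, every switch is run in turn and its gradients are accumulated before a single parameter update. Here `set_active_switch` is a hypothetical helper standing in for whatever mechanism activates one switch (channel slice plus the matching S-BN) throughout the model.

```python
def set_active_switch(model, idx):
    # Hypothetical helper: tell every switchable module which switch to use.
    for m in model.modules():
        if hasattr(m, "width_idx"):
            m.width_idx = idx

def train_step(model, optimizer, criterion, images, targets, num_switches):
    """One mini-batch step: accumulate gradients from every width switch,
    then apply a single parameter update."""
    optimizer.zero_grad()
    for idx in range(num_switches):
        set_active_switch(model, idx)
        loss = criterion(model(images), targets)
        loss.backward()          # gradients are summed across switches
    optimizer.step()
```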

Practical and Theoretical Implications

From a practical standpoint, the proposed methodology significantly reduces resource overhead by eliminating the need to train and store a separate model for each device capability. This efficiency is critical in edge computing environments with diverse hardware constraints. By sharing a single network across switchable configurations, slimmable neural networks deliver performance tailored to the operating environment while reducing energy consumption and enhancing computational adaptability.

Theoretically, this research opens avenues for further exploration into dynamic neural architectures. The ability to seamlessly transition across different model configurations without compromising performance quality emphasizes the potential for advanced adaptive systems in artificial intelligence. Future work could explore integration with other adaptive computation frameworks or extend this approach to unsupervised learning and reinforcement learning domains.

Conclusion

The paper offers a significant contribution to neural network design, presenting a flexible and efficient method for optimizing performance across varying computational environments. The introduction of Switchable Batch Normalization provides a robust mechanism for managing feature normalization across network configurations while preserving or enhancing overall model accuracy. This research establishes a foundational methodology that could be pivotal in developing more responsive and resource-efficient AI models across diverse application scenarios. The broad applicability of slimmable networks, and their potential integration with other adaptive techniques, marks notable progress in adaptation strategies for neural networks.
