
SlimSAM: 0.1% Data Makes Segment Anything Slim (2312.05284v4)

Published 8 Dec 2023 in cs.CV

Abstract: Current approaches for compressing the Segment Anything Model (SAM) yield commendable results, yet necessitate extensive data to train a new network from scratch. Conventional pruning techniques can markedly reduce data requirements, but at the cost of degraded performance. To address this challenging trade-off, we introduce SlimSAM, a novel data-efficient SAM compression method that achieves superior performance with substantially less training data. The essence of SlimSAM is the alternate slimming framework, which effectively enhances knowledge inheritance under severely limited training data and exceptional pruning ratios. Diverging from prior techniques, our framework progressively compresses the model by alternately pruning and distilling distinct, decoupled sub-structures. Disturbed Taylor pruning is also proposed to address the misalignment between the pruning objective and the training target, thereby improving post-distillation recovery. SlimSAM yields significant performance improvements while demanding over 10 times less training data than any other existing compression method. Even compared to the original SAM, SlimSAM achieves comparable performance while reducing parameter counts to merely 1.4% (9.1M), MACs to 0.8% (23G), and requiring only 0.1% (10k) of the SAM training data. The code is available at http://github.com/czg1225/SlimSAM.


Summary

  • The paper introduces SlimSAM, a framework that compresses the Segment Anything Model using only 0.1% of the original training data.
  • It employs an alternate slimming framework with disturbed Taylor pruning to balance model size reduction and segmentation performance.
  • Experimental results show significant parameter and computational reductions, enabling deployment on resource-constrained devices.

SlimSAM: A Data-Efficient Approach to Compression of Segment Anything Model

The paper, titled "SlimSAM: 0.1% Data Makes Segment Anything Slim," introduces an innovative approach to compressing the Segment Anything Model (SAM) with significantly reduced training data. The work addresses the challenge of maintaining model performance while minimizing computational and data requirements, offering a practical solution for deploying SAM on resource-limited devices.

Key Contributions

The central contribution of this work is the SlimSAM framework, which reduces the required training data to only 0.1% of the original SAM dataset. The authors propose an alternate slimming framework that alternates between pruning and distillation, effectively managing the trade-off between model size and performance. This approach is augmented by a novel pruning method, disturbed Taylor pruning, which aligns the pruning objective with the training target to enhance post-distillation recovery.

Technical Approach

  • Alternate Slimming Framework: The framework divides the SAM image encoder into decoupled sub-structures (embedding dimensions and bottleneck dimensions) and prunes them in alternation, distilling after each pruning step. Handling one sub-structure at a time limits deviation from the original model and enables efficient intermediate feature alignment.
  • Disturbed Taylor Pruning: This method estimates parameter importance from the divergence between the student's outputs and the teacher's soft labels. Because the unpruned student initially matches the teacher, that divergence and its gradients would be zero; injecting Gaussian noise into the soft labels produces non-zero gradients for importance estimation without requiring hard labels, resolving the traditional misalignment between pruning objectives and training targets (see the sketch after this list).
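
To make the two ideas above concrete, here is a minimal PyTorch-style sketch rather than the authors' implementation: `prune_substructure` and `distill` are hypothetical placeholders for structural channel removal and feature-level distillation, `loader` is assumed to yield image batches, and the MSE loss on noise-disturbed soft labels is a simplified stand-in for the paper's objective.

```python
import torch
import torch.nn.functional as F

def disturbed_taylor_scores(student, teacher, images, sigma=1e-4):
    """First-order Taylor importance driven by the distillation loss.

    Before pruning, the student is a copy of the teacher, so the plain
    soft-label loss (and its gradients) would be exactly zero; disturbing
    the soft labels with Gaussian noise yields usable non-zero gradients.
    """
    student.zero_grad()
    with torch.no_grad():
        soft_labels = teacher(images)
    disturbed = soft_labels + sigma * torch.randn_like(soft_labels)
    F.mse_loss(student(images), disturbed).backward()

    scores = {}
    for name, p in student.named_parameters():
        if p.grad is not None and p.dim() > 1:
            # |w * dL/dw|, aggregated over all axes except output channels.
            scores[name] = (p * p.grad).abs().sum(dim=tuple(range(1, p.dim())))
    return scores

def alternate_slimming(student, teacher, loader, ratio, distill_steps):
    """Prune and distill the two decoupled sub-structures in alternation."""
    for substructure in ("embedding", "bottleneck"):
        images = next(iter(loader))
        scores = disturbed_taylor_scores(student, teacher, images)
        prune_substructure(student, substructure, scores, ratio)  # hypothetical helper
        distill(student, teacher, loader, steps=distill_steps)    # hypothetical helper
    return student
```

The key design choice reflected here is that importance estimation uses the same soft-label objective that later drives distillation, so the channels kept are exactly those that matter for the recovery phase.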

Experimental Results

SlimSAM demonstrates marked improvements over existing methods, achieving large reductions in both parameter count and computation. It compresses the original SAM to 1.4% of its parameters (9.1M) and 0.8% of its MACs (23G) while training on only 10k unlabeled images, less than one-tenth of the data required by other compression techniques.
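
Those two percentages also fix the implied scale of the uncompressed model; a quick back-of-the-envelope check using only the figures quoted above:

```python
# Derived purely from the numbers above: 1.4% of original parameters,
# 0.8% of original MACs.
slim_params, slim_macs = 9.1e6, 23e9
orig_params = slim_params / 0.014   # ≈ 650M parameters (SAM ViT-H scale)
orig_macs = slim_macs / 0.008       # ≈ 2,875G MACs
print(f"implied original SAM: {orig_params / 1e6:.0f}M params, "
      f"{orig_macs / 1e9:.0f}G MACs")
```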

Comparative Analysis

The results highlight SlimSAM's superior performance and efficiency. When tested against other SAM compression methods, SlimSAM consistently yields higher Mean Intersection over Union (MIoU) scores, particularly under severe data constraints, surpassing models such as FastSAM, MobileSAM, and EfficientSAM, which require substantially larger training sets.
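
For context, the MIoU figures compared here average per-mask intersection over union between predicted and ground-truth masks; a minimal NumPy sketch of the metric (not the paper's evaluation code):

```python
import numpy as np

def mask_iou(pred, gt):
    """IoU between two boolean segmentation masks of the same shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:            # both masks empty: count as a perfect match
        return 1.0
    return np.logical_and(pred, gt).sum() / union

def miou(preds, gts):
    """Mean IoU over paired lists of predicted and ground-truth masks."""
    return float(np.mean([mask_iou(p, g) for p, g in zip(preds, gts)]))
```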

Implications and Future Work

The implications of this research are substantial for the deployment of SAM-based models in environments where data and computational resources are constrained. The SlimSAM approach not only preserves the robust segmentation capabilities of SAM but also opens avenues for its application on edge devices.

Future work could explore extending the SlimSAM framework to other large-scale models and further refining the pruning-distillation process to achieve even greater efficiency. Moreover, investigating the potential of SlimSAM in various real-world scenarios could provide additional insights into its practical applications.

Conclusion

This paper presents a significant advance in model compression, broadening the applicability of SAM through an innovative, data-efficient method. The combination of alternate slimming and disturbed Taylor pruning offers a powerful toolkit for compressing large models while keeping their performance largely intact with minimal training resources.
