Contrastive Representation Distillation via Multi-Scale Feature Decoupling (2502.05835v2)
Abstract: Knowledge distillation enhances the performance of a small student network, without increasing its parameter count, by transferring knowledge from a large, pre-trained teacher network. In the feature space, different local regions within a single global feature map often encode distinct yet interdependent semantic information. Previous methods, however, mainly transfer global feature knowledge and neglect to decouple these interdependent local regions, which often results in suboptimal performance. To address this limitation, we propose MSDCRD, a contrastive representation distillation approach that explicitly performs multi-scale decoupling in the feature space. MSDCRD applies multi-scale sliding-window pooling to the feature map to capture representations at various granularities; combined with sample categorization, this enables efficient multi-scale feature decoupling and, together with a new contrastive loss function, forms the core of MSDCRD. Feature representations differ significantly across network architectures, and this divergence is more pronounced between heterogeneous models, making feature distillation particularly challenging. Despite this, our method not only achieves superior performance with homogeneous teacher-student pairs but also transfers feature knowledge efficiently across a variety of heterogeneous pairs, highlighting its strong generalizability. Moreover, its plug-and-play, parameter-free nature allows flexible integration into different visual tasks. Extensive experiments on multiple visual benchmarks consistently confirm that our method improves the performance of student models.
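The abstract does not give implementation details, but the following minimal sketch illustrates the two ingredients it names: multi-scale sliding-window pooling over a feature map, and a contrastive objective between the resulting student and teacher descriptors. The function names, window sizes, and the InfoNCE-style loss here are illustrative assumptions, not the paper's reference implementation.

```python
# Hypothetical sketch: multi-scale sliding-window pooling followed by an
# InfoNCE-style contrastive loss between student and teacher descriptors.
# All names and hyperparameters are assumptions made for illustration.
import torch
import torch.nn.functional as F

def multi_scale_pool(feat, window_sizes=(1, 2, 4)):
    """Pool a feature map [B, C, H, W] with sliding windows of several sizes,
    returning one tensor of flattened local descriptors per scale."""
    descriptors = []
    for k in window_sizes:
        # Average-pool with a k x k window (stride k keeps regions disjoint).
        pooled = F.avg_pool2d(feat, kernel_size=k, stride=k)  # [B, C, H/k, W/k]
        B, C, h, w = pooled.shape
        # Flatten spatial positions so each local region is one descriptor row.
        descriptors.append(pooled.flatten(2).transpose(1, 2).reshape(B * h * w, C))
    return descriptors

def contrastive_loss(z_s, z_t, temperature=0.1):
    """InfoNCE: each student descriptor should match its own teacher
    descriptor (positive) against all other descriptors (negatives)."""
    z_s = F.normalize(z_s, dim=1)
    z_t = F.normalize(z_t, dim=1)
    logits = z_s @ z_t.t() / temperature            # [N, N] similarity matrix
    targets = torch.arange(z_s.size(0), device=z_s.device)
    return F.cross_entropy(logits, targets)

# Toy usage: student/teacher feature maps with matching channel and spatial
# dimensions (in practice the student is usually projected to the teacher dim).
f_s = torch.randn(8, 128, 8, 8)   # student features
f_t = torch.randn(8, 128, 8, 8)   # teacher features
loss = sum(contrastive_loss(s, t)
           for s, t in zip(multi_scale_pool(f_s), multi_scale_pool(f_t)))
```

Disjoint average pooling is only one way to realize the sliding window; overlapping strides or max pooling would fit the same framework, and the paper's sample-categorization step and exact loss may differ from the plain InfoNCE objective sketched here.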