Complementary Relation Contrastive Distillation (2103.16367v1)

Published 29 Mar 2021 in cs.CV

Abstract: Knowledge distillation aims to transfer representation ability from a teacher model to a student model. Previous approaches focus on either individual representation distillation or inter-sample similarity preservation. We argue that the inter-sample relation conveys abundant information and needs to be distilled more effectively. In this paper, we propose a novel knowledge distillation method, namely Complementary Relation Contrastive Distillation (CRCD), to transfer structural knowledge from the teacher to the student. Specifically, we estimate the mutual relation in an anchor-based way and distill the anchor-student relation under the supervision of its corresponding anchor-teacher relation. To make it more robust, mutual relations are modeled by two complementary elements: the feature and its gradient. Furthermore, the lower bound of the mutual information between the anchor-teacher relation distribution and the anchor-student relation distribution is maximized via a relation contrastive loss, which distills both the sample representation and the inter-sample relations. Experiments on different benchmarks demonstrate the effectiveness of our proposed CRCD.
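
To make the anchor-based relation contrastive idea concrete, below is a minimal PyTorch sketch of a relation contrastive loss, written from the abstract alone rather than the authors' reference implementation. The class name `RelationContrastiveLoss` and the parameters `proj_dim` and `temperature` are illustrative assumptions; the sketch models relations with features only and omits the complementary gradient-based relations described in the paper.

```python
# Hypothetical sketch of an anchor-based relation contrastive loss (not the
# authors' reference code). Relations are cosine similarities between each
# sample and a set of anchors; the student's relation vector for a sample is
# contrasted against the teacher's relation vector for the same sample, which
# yields an InfoNCE-style lower bound on the mutual information between the
# anchor-student and anchor-teacher relation distributions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelationContrastiveLoss(nn.Module):
    def __init__(self, student_dim, teacher_dim, proj_dim=128, temperature=0.1):
        super().__init__()
        self.proj_s = nn.Linear(student_dim, proj_dim)  # student projection head
        self.proj_t = nn.Linear(teacher_dim, proj_dim)  # teacher projection head
        self.temperature = temperature

    def forward(self, feat_s, feat_t, anchors_s, anchors_t):
        # feat_s: (B, Ds) student features, feat_t: (B, Dt) teacher features
        # anchors_s: (A, Ds) anchor features in the student's space
        # anchors_t: (A, Dt) anchor features in the teacher's space
        zs = F.normalize(self.proj_s(feat_s), dim=1)      # (B, P)
        zt = F.normalize(self.proj_t(feat_t), dim=1)      # (B, P)
        a_s = F.normalize(self.proj_s(anchors_s), dim=1)  # (A, P)
        a_t = F.normalize(self.proj_t(anchors_t), dim=1)  # (A, P)

        # Anchor-student and anchor-teacher relations, one row per sample.
        rel_s = F.normalize(zs @ a_s.t(), dim=1)  # (B, A)
        rel_t = F.normalize(zt @ a_t.t(), dim=1)  # (B, A)

        # Matching (student, teacher) relation rows are positives; all other
        # rows in the batch serve as negatives.
        logits = rel_s @ rel_t.t() / self.temperature      # (B, B)
        targets = torch.arange(rel_s.size(0), device=rel_s.device)
        return F.cross_entropy(logits, targets)
```

In practice the anchors would come from a memory bank or from other samples in the batch, and the paper additionally builds relations from feature gradients; this sketch only illustrates the contrastive structure over anchor relations.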

Authors (8)
  1. Jinguo Zhu (20 papers)
  2. Shixiang Tang (48 papers)
  3. Dapeng Chen (33 papers)
  4. Shijie Yu (9 papers)
  5. Yakun Liu (4 papers)
  6. Aijun Yang (4 papers)
  7. Mingzhe Rong (3 papers)
  8. Xiaohua Wang (26 papers)
Citations (69)
