Black-box Few-shot Knowledge Distillation (2207.12106v1)

Published 25 Jul 2022 in cs.CV, cs.AI, and cs.LG

Abstract: Knowledge distillation (KD) is an efficient approach to transferring knowledge from a large "teacher" network to a smaller "student" network. Traditional KD methods require many labeled training samples and a white-box teacher (one whose parameters are accessible) to train a good student. However, these resources are not always available in real-world applications: the distillation process often happens on an external party's side, where little data is available, and the teacher does not disclose its parameters due to security and privacy concerns. To overcome these challenges, we propose a black-box few-shot KD method that trains the student with few unlabeled training samples and a black-box teacher. Our main idea is to expand the training set by generating a diverse set of out-of-distribution synthetic images using MixUp and a conditional variational auto-encoder. These synthetic images, along with their labels obtained from the teacher, are used to train the student. Extensive experiments show that our method significantly outperforms recent SOTA few/zero-shot KD methods on image classification tasks. The code and models are available at: https://github.com/nphdang/FS-BBT
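The abstract outlines a simple pipeline: synthesize extra images from the few available unlabeled samples, query the black-box teacher for their labels, and train the student on the expanded set. The sketch below (not the authors' FS-BBT code) illustrates that loop with PyTorch using MixUp only; the conditional VAE generator described in the abstract is omitted, and the teacher, student, and data are stand-in placeholders.

```python
# Minimal sketch of black-box few-shot KD: expand a small unlabeled set with
# MixUp, label the images by querying a black-box teacher (outputs only, no
# access to parameters), and distill into a student. Models/data are dummies.
import torch
import torch.nn as nn
import torch.nn.functional as F


def mixup_images(images: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Create synthetic images by convexly combining random pairs (MixUp)."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(images.size(0))
    return lam * images + (1.0 - lam) * images[perm]


@torch.no_grad()
def query_black_box_teacher(teacher: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Black-box access: we only observe the teacher's output probabilities."""
    return F.softmax(teacher(images), dim=1)


def distill_step(student, optimizer, images, teacher_probs) -> float:
    """One student update matching the teacher's soft labels via KL divergence."""
    optimizer.zero_grad()
    log_probs = F.log_softmax(student(images), dim=1)
    loss = F.kl_div(log_probs, teacher_probs, reduction="batchmean")
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    num_classes, few_shot = 10, 32  # hypothetical sizes
    teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
    student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

    few_unlabeled = torch.rand(few_shot, 3, 32, 32)  # the few real, unlabeled samples
    for epoch in range(5):
        synthetic = mixup_images(few_unlabeled)      # expand the training set
        batch = torch.cat([few_unlabeled, synthetic])
        targets = query_black_box_teacher(teacher, batch)
        loss = distill_step(student, optimizer, batch, targets)
        print(f"epoch {epoch}: distillation loss = {loss:.4f}")
```

In the paper's full method the synthetic set is further diversified with a conditional VAE trained on the few real samples; the MixUp-only loop above is only the simplest illustration of the query-and-distill cycle under black-box access.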

Authors (4)
  1. Dang Nguyen (49 papers)
  2. Sunil Gupta (78 papers)
  3. Kien Do (35 papers)
  4. Svetha Venkatesh (160 papers)
Citations (14)

