Weighted Distillation with Unlabeled Examples (2210.06711v1)

Published 13 Oct 2022 in cs.LG and cs.AI

Abstract: Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available, and then it is used to generate labels on an unlabeled dataset (typically much larger in size). These labels are then utilized to train the smaller "student" model which will actually be deployed. Naturally, the success of the approach depends on the quality of the teacher's labels, since the student could be confused if trained on inaccurate data. This paper proposes a principled approach for addressing this issue based on a "debiasing" reweighting of the student's loss function tailored to the distillation training paradigm. Our method is hyper-parameter free, data-agnostic, and simple to implement. We demonstrate significant improvements on popular academic datasets and we accompany our results with a theoretical analysis which rigorously justifies the performance of our method in certain settings.
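The abstract describes a teacher-student pipeline in which the teacher labels an unlabeled set and the student is trained on those labels with a per-example reweighted loss. The sketch below (PyTorch) illustrates that general structure only; the `example_weights` function is a hypothetical teacher-confidence placeholder, not the paper's hyper-parameter-free "debiasing" reweighting, and the function names are assumptions made for illustration.

```python
# Minimal sketch of distillation with unlabeled examples using a per-example
# weighted student loss. The weighting here (teacher confidence) is a
# placeholder for illustration, NOT the paper's debiasing scheme.
import torch
import torch.nn.functional as F

def teacher_label_unlabeled(teacher, unlabeled_loader, device="cpu"):
    """Run the trained teacher over the unlabeled set to produce soft labels."""
    teacher.eval()
    inputs, soft_labels = [], []
    with torch.no_grad():
        for x in unlabeled_loader:
            x = x.to(device)
            soft_labels.append(F.softmax(teacher(x), dim=-1).cpu())
            inputs.append(x.cpu())
    return torch.cat(inputs), torch.cat(soft_labels)

def example_weights(soft_labels):
    """Hypothetical per-example weights from teacher confidence (max probability).
    The paper instead derives weights via a principled debiasing reweighting."""
    return soft_labels.max(dim=-1).values

def weighted_distillation_step(student, optimizer, x, soft_labels, weights):
    """One student update: cross-entropy against the teacher's soft labels,
    scaled by the per-example weights before averaging."""
    student.train()
    log_probs = F.log_softmax(student(x), dim=-1)
    per_example_loss = -(soft_labels * log_probs).sum(dim=-1)  # CE vs. soft labels
    loss = (weights * per_example_loss).mean()                 # reweighted loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the reweighting is applied per example before averaging, so inaccurate teacher labels can be down-weighted rather than discarded; the paper's contribution is how those weights are chosen in a principled, hyper-parameter-free way.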

Authors (6)
  1. Fotis Iliopoulos (16 papers)
  2. Vasilis Kontonis (27 papers)
  3. Cenk Baykal (16 papers)
  4. Gaurav Menghani (10 papers)
  5. Khoa Trinh (10 papers)
  6. Erik Vee (14 papers)
Citations (12)
