Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 99 tok/s
Gemini 2.5 Pro 43 tok/s Pro
GPT-5 Medium 28 tok/s
GPT-5 High 35 tok/s Pro
GPT-4o 94 tok/s
GPT OSS 120B 476 tok/s Pro
Kimi K2 190 tok/s Pro
2000 character limit reached

Exploring Dark Knowledge under Various Teacher Capacities and Addressing Capacity Mismatch (2405.13078v2)

Published 21 May 2024 in cs.LG

Abstract: Knowledge Distillation (KD) could transfer the ``dark knowledge" of a well-performed yet large neural network to a weaker but lightweight one. From the view of output logits and softened probabilities, this paper goes deeper into the dark knowledge provided by teachers with different capacities. Two fundamental observations are: (1) a larger teacher tends to produce probability vectors with lower distinction among non-ground-truth classes; (2) teachers with different capacities are basically consistent in their cognition of relative class affinity. Through abundant experimental studies we verify these observations and provide in-depth empirical explanations to them. We argue that the distinctness among incorrect classes embodies the essence of dark knowledge. A larger and more accurate teacher lacks this distinctness, which hampers its teaching ability compared to a smaller teacher, ultimately leading to the peculiar phenomenon named "capacity mismatch". Building on this insight, this paper explores multiple simple yet effective ways to address capacity mismatch, achieving superior experimental results compared to previous approaches.

Citations (1)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Paper Prompts

Sign up for free to create and run prompts on this paper using GPT-5.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets