Improving Knowledge Distillation via Transferring Learning Ability (2304.11923v2)
Abstract: Existing knowledge distillation methods generally follow a teacher-student approach, in which the student network learns solely from a well-trained teacher. However, this approach overlooks the inherent difference in learning ability between the teacher and student networks, which gives rise to the capacity-gap problem. To address this limitation, we propose a novel method called SLKD.
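For context, the teacher-student setup the abstract refers to is conventionally trained with a combination of a soft-target loss (KL divergence between temperature-softened teacher and student logits) and a hard-target cross-entropy loss. The sketch below shows this standard baseline loss (Hinton et al.), not SLKD itself; the temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitation of the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard
```

In this baseline the teacher is frozen and information flows one way, which is exactly the limitation the abstract identifies: the student's differing learning ability is never taken into account.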