Improving Knowledge Distillation via Transferring Learning Ability (2304.11923v2)
Abstract: Existing knowledge distillation methods generally follow a teacher-student approach, in which the student network learns solely from a well-trained teacher. However, this approach overlooks the inherent difference in learning ability between the teacher and student networks, which gives rise to the capacity-gap problem. To address this limitation, we propose a novel method called SLKD.
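For context, the teacher-student setup the abstract refers to is conventionally trained with a combination of a soft-target loss (KL divergence between temperature-softened teacher and student logits) and a hard-target cross-entropy loss. The sketch below shows this standard baseline loss (Hinton et al.), not SLKD itself; the temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitation of the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard
```

In this baseline the teacher is frozen and information flows one way, which is exactly the limitation the abstract identifies: the student's differing learning ability is never taken into account.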