Learning Interpretation with Explainable Knowledge Distillation (2111.06945v1)

Published 12 Nov 2021 in cs.LG and cs.AI

Abstract: Knowledge Distillation (KD) has been considered a key solution for model compression and acceleration in recent years. In KD, a small student model is generally trained from a large teacher model by minimizing the divergence between the probabilistic outputs of the two. However, as demonstrated in our experiments, existing KD methods might not transfer critical explainable knowledge of the teacher to the student, i.e., the explanations of predictions made by the two models are not consistent. In this paper, we propose a novel explainable knowledge distillation model, called XDistillation, through which both the performance and the explanations' information are transferred from the teacher model to the student model. The XDistillation model leverages the idea of convolutional autoencoders to approximate the teacher explanations. Our experiments show that models trained by XDistillation outperform those trained by conventional KD methods not only in terms of predictive accuracy but also in faithfulness to the teacher models.
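
The abstract contrasts XDistillation with conventional KD, in which the student is trained to match the teacher's probabilistic outputs. The sketch below illustrates that standard KD objective only, not the paper's XDistillation method; the temperature `T` and mixing weight `alpha` are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the conventional KD objective described in the abstract:
# the student is trained to match the teacher's temperature-softened outputs
# in addition to fitting the ground-truth labels.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Cross-entropy against the hard labels.
    ce = F.cross_entropy(student_logits, labels)
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kl

# Example usage with random tensors standing in for model outputs.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student_logits, teacher_logits, labels)
```

Per the abstract, XDistillation augments this kind of output matching with an explanation-transfer component based on convolutional autoencoders that approximate the teacher's explanations; the details of that component are given in the paper itself.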

Authors (3)
  1. Raed Alharbi (5 papers)
  2. Minh N. Vu (12 papers)
  3. My T. Thai (71 papers)
Citations (12)
