Unbiased Knowledge Distillation for Recommendation (2211.14729v1)

Published 27 Nov 2022 in cs.IR and cs.AI

Abstract: As a promising solution for model compression, knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency. Traditional solutions first train a full teacher model on the training data and then transfer its knowledge (i.e., soft labels) to supervise the learning of a compact student model. However, we find that this standard distillation paradigm incurs a serious bias issue: popular items are recommended even more heavily after distillation. This effect prevents the student model from making accurate and fair recommendations, decreasing the effectiveness of the RS. In this work, we identify the origin of the bias in KD: it is rooted in the biased soft labels produced by the teacher and is further propagated and intensified during distillation. To rectify this, we propose a new KD method with a stratified distillation strategy. It first partitions items into multiple groups according to their popularity, and then extracts the ranking knowledge within each group to supervise the learning of the student. Our method is simple and teacher-agnostic: it operates at the distillation stage without affecting the training of the teacher model. We conduct extensive theoretical and empirical studies to validate the effectiveness of our proposal. We release our code at: https://github.com/chengang95/UnKD.
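To make the stratified distillation idea concrete, below is a minimal sketch of how popularity-stratified ranking distillation could look in PyTorch. It is not the authors' implementation (see the linked repository for that); the group partitioning into equal-size popularity buckets, the BPR-style pairwise loss, and all names such as `stratified_kd_loss`, `num_groups`, and `pairs_per_group` are illustrative assumptions.

```python
# Sketch of popularity-stratified ranking distillation (assumed formulation,
# not the exact UnKD loss). Items are split into popularity groups, and the
# student is trained to reproduce the teacher's ranking only *within* each
# group, so cross-group popularity signals are not distilled.
import torch
import torch.nn.functional as F


def partition_by_popularity(item_popularity: torch.Tensor, num_groups: int):
    """Split item ids into `num_groups` buckets of roughly equal size,
    ordered from most to least popular (assumed partitioning scheme)."""
    order = torch.argsort(item_popularity, descending=True)
    return torch.chunk(order, num_groups)


def stratified_kd_loss(teacher_scores, student_scores, item_popularity,
                       num_groups=2, pairs_per_group=10):
    """Distill the teacher's within-group ranking into the student.

    teacher_scores, student_scores: (num_items,) scores for one user.
    item_popularity: (num_items,) interaction counts used for stratification.
    """
    loss = student_scores.new_zeros(())
    for group in partition_by_popularity(item_popularity, num_groups):
        if group.numel() < 2:
            continue
        # Sample item pairs inside the group and keep only pairs where the
        # teacher expresses a strict preference (item i over item j).
        idx = torch.randint(0, group.numel(), (pairs_per_group, 2))
        i, j = group[idx[:, 0]], group[idx[:, 1]]
        prefer = teacher_scores[i] > teacher_scores[j]
        if prefer.any():
            # BPR-style pairwise loss: push the student to reproduce the
            # teacher's preferred ordering within this popularity stratum.
            diff = student_scores[i[prefer]] - student_scores[j[prefer]]
            loss = loss + F.softplus(-diff).mean()
    return loss / num_groups
```

In a training loop, this term would be added (with a weighting coefficient) to the student's usual recommendation loss; because it only reads the teacher's scores, it leaves the teacher's training untouched, matching the teacher-agnostic property described in the abstract.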

Authors (5)
  1. Gang Chen (592 papers)
  2. Jiawei Chen (161 papers)
  3. Fuli Feng (143 papers)
  4. Sheng Zhou (186 papers)
  5. Xiangnan He (200 papers)
Citations (17)