DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT (2110.01900v4)

Published 5 Oct 2021 in cs.CL and eess.AS

Abstract: Self-supervised speech representation learning methods like wav2vec 2.0 and Hidden-unit BERT (HuBERT) leverage unlabeled speech data for pre-training and offer good representations for numerous speech processing tasks. Despite the success of these methods, they require large memory and high pre-training costs, making them inaccessible to researchers in academia and small companies. Therefore, this paper introduces DistilHuBERT, a novel multi-task learning framework that distills hidden representations directly from a HuBERT model. This method reduces HuBERT's size by 75% and makes it 73% faster while retaining most of its performance on ten different tasks. Moreover, DistilHuBERT requires little training time and data, opening the possibility of pre-training personal and on-device SSL models for speech.
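
The abstract describes a multi-task, layer-wise distillation setup: a small student encoder with several prediction heads, each trained to match the hidden states of selected teacher (HuBERT) layers. The sketch below illustrates that idea under stated assumptions; the layer choices (4/8/12), head design, and exact loss weighting are illustrative placeholders, not the authors' released implementation.

```python
# Minimal sketch of layer-wise distillation in the spirit of DistilHuBERT.
# Target layers, model sizes, and the loss weighting are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistilStudent(nn.Module):
    def __init__(self, dim=768, n_student_layers=2, target_layers=(4, 8, 12)):
        super().__init__()
        # Small shared Transformer encoder (the distilled student).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=12, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_student_layers)
        # One prediction head per distilled teacher layer (the multi-task objective).
        self.heads = nn.ModuleList(nn.Linear(dim, dim) for _ in target_layers)
        self.target_layers = target_layers

    def forward(self, feats):
        h = self.encoder(feats)                  # (B, T, dim) shared representation
        return [head(h) for head in self.heads]  # one prediction per target layer

def distill_loss(preds, teacher_hiddens, lam=1.0):
    """L1 distance plus a cosine-similarity term per predicted teacher layer (assumed form)."""
    loss = 0.0
    for p, t in zip(preds, teacher_hiddens):
        l1 = F.l1_loss(p, t)
        cos = F.cosine_similarity(p, t, dim=-1)  # (B, T)
        loss = loss + l1 - lam * torch.log(torch.sigmoid(cos)).mean()
    return loss

# Toy usage: teacher hidden states would come from a frozen HuBERT forward pass.
B, T, D = 2, 100, 768
feats = torch.randn(B, T, D)                                # CNN feature-extractor output (assumed)
teacher_hiddens = [torch.randn(B, T, D) for _ in range(3)]  # hidden states of layers 4, 8, 12
student = DistilStudent()
loss = distill_loss(student(feats), teacher_hiddens)
loss.backward()
```

After distillation, the heads are typically discarded and the student encoder serves as a compact feature extractor for downstream speech tasks.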

Authors (3)
  1. Heng-Jui Chang (16 papers)
  2. Shu-wen Yang (17 papers)
  3. Hung-yi Lee (327 papers)
Citations (150)
