
Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training (2206.08189v2)

Published 16 Jun 2022 in cs.SD and eess.AS

Abstract: Recent studies have shown that the benefits provided by self-supervised pre-training and self-training (pseudo-labeling) are complementary. Semi-supervised fine-tuning strategies under the pre-training framework, however, remain insufficiently studied. Moreover, modern semi-supervised speech recognition algorithms either treat unlabeled data indiscriminately or filter out noisy samples with a confidence threshold; the dissimilarities among different unlabeled data are often ignored. In this paper, we propose Censer, a semi-supervised speech recognition algorithm based on self-supervised pre-training that maximizes the utilization of unlabeled data. The pre-training stage of Censer adopts wav2vec2.0, and the fine-tuning stage employs an improved semi-supervised learning algorithm derived from slimIPL, which leverages unlabeled data progressively according to the quality of their pseudo labels. We also incorporate a temporal pseudo-label pool and an exponential moving average to control the pseudo-label update frequency and to avoid model divergence. Experimental results on the Libri-Light and LibriSpeech datasets show that our proposed method achieves better performance than existing approaches while being more unified.
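
The fine-tuning stage combines two stabilizing ingredients named in the abstract: an exponential moving average (EMA) of the model weights and a temporal pseudo-label pool that limits how often pseudo labels are refreshed. Below is a minimal sketch of these two mechanisms, assuming a PyTorch setting; the class and function names, the decay factor, and the pool-replacement policy are illustrative assumptions, not the authors' implementation.

```python
import random
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Exponential moving average of student weights into the teacher copy.

    The teacher (EMA) model changes slowly, so the pseudo labels it produces
    drift gradually, which is the abstract's stated motivation for using EMA
    to avoid model divergence. The decay value here is a placeholder.
    """
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

class PseudoLabelPool:
    """Fixed-size pool of (utterance_id, pseudo_label) pairs.

    Fresh pseudo labels enter the pool, while training consumes older
    entries sampled from it. Drawing from the pool instead of always using
    the newest labels throttles the pseudo-label update frequency.
    """
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.pool = []  # list of (utt_id, pseudo_label)

    def push_and_sample(self, utt_id, pseudo_label):
        # While the pool is warming up, only store labels and skip the
        # unsupervised loss for this batch.
        if len(self.pool) < self.capacity:
            self.pool.append((utt_id, pseudo_label))
            return None
        # Replace a randomly chosen stale entry with the fresh label and
        # return the stale one for training.
        idx = random.randrange(self.capacity)
        stale = self.pool[idx]
        self.pool[idx] = (utt_id, pseudo_label)
        return stale
```

In this sketch, the teacher would be initialized as a copy of the fine-tuned student and used to transcribe unlabeled audio; the curriculum aspect (ordering unlabeled data by pseudo-label quality) would sit on top of this loop and is not shown here.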

Authors (6)
  1. Bowen Zhang (161 papers)
  2. Songjun Cao (15 papers)
  3. Xiaoming Zhang (113 papers)
  4. Yike Zhang (33 papers)
  5. Long Ma (116 papers)
  6. Takahiro Shinozaki (13 papers)
Citations (4)
