Theoretical analysis of why pseudo-labeling improves ASR performance
Develop a theoretical framework that explains why training Conformer-1 on large-scale pseudo-labeled speech data via Noisy Student Training yields empirical reductions in Word Error Rate (WER) and gains in robustness. Rigorously evaluate hypothesized mechanisms, such as suppression of outlier samples and expanded coverage of the training distribution, to establish a scientific basis for these effects.
Sponsor
References
However, this conclusion is based purely on empirical results, and a theoretical analysis of these results remains an open area of exploration. We hypothesize that pseudo-labels help for several distinct reasons, including suppressing the negative effects of outlier samples and covering a wider training distribution, but we want a more rigorous scientific basis for these explanations.
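To make the mechanism under study concrete, the following is a minimal sketch of one Noisy Student Training round on toy data. It substitutes a nearest-centroid classifier for Conformer-1, and the confidence threshold, noise level, and synthetic data are illustrative assumptions rather than details from this project; it shows only the teacher → pseudo-label → filter → noisy student loop whose effects the proposed framework would analyze.

```python
# Toy Noisy Student Training round: teacher pseudo-labels an unlabeled
# pool, low-confidence samples are filtered (a crude form of outlier
# suppression), and a student is trained on the expanded, noised data.
# All modeling choices here are simplifying assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

def fit_centroids(X, y, n_classes):
    """'Train' a nearest-centroid classifier: one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, X):
    """Return (labels, confidences) from distances to the class centroids."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    sorted_d = np.sort(d, axis=1)
    conf = sorted_d[:, 1] - sorted_d[:, 0]  # margin between two closest classes
    return labels, conf

n_classes = 2
# Small labeled set and a larger unlabeled pool from the same two Gaussians.
X_lab = np.concatenate([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y_lab = np.array([0] * 20 + [1] * 20)
X_unlab = np.concatenate([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])

# 1. Teacher trained on the labeled data only.
teacher = fit_centroids(X_lab, y_lab, n_classes)

# 2. Teacher pseudo-labels the unlabeled pool; keep only confident samples.
pseudo_y, conf = predict(teacher, X_unlab)
keep = conf > 1.0  # assumed confidence threshold

# 3. Student trained on labeled + pseudo-labeled data, with input noise added.
X_keep = X_unlab[keep] + rng.normal(0, 0.3, X_unlab[keep].shape)
X_student = np.concatenate([X_lab, X_keep])
y_student = np.concatenate([y_lab, pseudo_y[keep]])
student = fit_centroids(X_student, y_student, n_classes)
```

In this toy setting, the confidence filter in step 2 and the noise in step 3 are stand-ins for the two hypothesized mechanisms: discarding likely-mislabeled outliers, and exposing the student to a wider effective input distribution than the labeled set alone.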