Population Based Training for Data Augmentation and Regularization in Speech Recognition (2010.03899v1)

Published 8 Oct 2020 in cs.CL, cs.SD, and eess.AS

Abstract: Varying data augmentation policies and regularization over the course of optimization has led to performance improvements over using fixed values. We show that population based training is a useful tool to continuously search those hyperparameters within a fixed budget. This greatly simplifies the experimental burden and computational cost of finding such optimal schedules. We experiment in speech recognition by optimizing SpecAugment this way, as well as dropout. This approach compares favorably to a baseline that keeps those hyperparameters fixed over the course of training, with an 8% relative WER improvement. We obtain a 5.18% word error rate on LibriSpeech's test-other.
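The core mechanism the abstract relies on is population based training (PBT): a population of workers trains in parallel, and periodically the worst performers copy the weights and hyperparameters of the best ones (exploit) and then randomly perturb those hyperparameters (explore). Below is a minimal, self-contained sketch of that generic loop, not the authors' implementation; the `Worker` class, its toy `train_steps`/`eval_score` stubs, the 25% truncation selection, and the ×0.8/×1.2 perturbation factors are all illustrative assumptions standing in for real ASR training and dev-set WER evaluation.

```python
import copy
import random

class Worker:
    """One population member. In the paper's setting this would wrap a
    speech recognition model; here training and evaluation are stubbed."""

    def __init__(self, hparams):
        self.hparams = dict(hparams)   # e.g. dropout, SpecAugment mask sizes
        self.weights = 0.0             # stands in for model parameters
        self.score = float("inf")      # lower is better (e.g. dev-set WER)

    def train_steps(self, n=100):
        # Placeholder for n optimizer steps under the current hyperparameters.
        self.weights += n * (1.0 - self.hparams["dropout"])

    def eval_score(self):
        # Placeholder for dev-set WER: a made-up objective minimized at 0.3.
        self.score = (self.hparams["dropout"] - 0.3) ** 2 + random.gauss(0, 0.01)
        return self.score

def exploit_and_explore(population, perturb=1.2):
    """Bottom 25% copy weights and hparams from the top 25%, then perturb."""
    population.sort(key=lambda w: w.score)           # best first
    n = max(1, len(population) // 4)
    for loser, winner in zip(population[-n:], population[:n]):
        loser.weights = copy.deepcopy(winner.weights)  # exploit
        loser.hparams = dict(winner.hparams)
        for k in loser.hparams:                        # explore
            loser.hparams[k] *= random.choice([1 / perturb, perturb])

population = [Worker({"dropout": random.uniform(0.05, 0.6)}) for _ in range(8)]
for generation in range(20):
    for w in population:
        w.train_steps()
        w.eval_score()
    exploit_and_explore(population)

best = min(population, key=lambda w: w.score)
print(f"best dropout at end of training: {best.hparams['dropout']:.3f}")
```

Because losers inherit winners' weights along with their hyperparameters, each perturbation takes effect mid-run, so the search yields a hyperparameter schedule over training rather than a single fixed value, which is exactly the property the abstract exploits for SpecAugment and dropout.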

Authors (3)
  1. Daniel Haziza (10 papers)
  2. Jérémy Rapin (20 papers)
  3. Gabriel Synnaeve (97 papers)
Citations (1)