
Significance of Data Augmentation for Improving Cleft Lip and Palate Speech Recognition (2110.00797v1)

Published 2 Oct 2021 in eess.AS and cs.SD

Abstract: Automatic recognition of pathological speech, particularly from children with articulatory impairments, is a challenging task. One major obstacle is the lack of domain-specific data, which hinders the use of such recognition in speech-based applications targeting pathological speakers. To address this challenge, we investigate several data augmentation techniques that simulate training data for improving children's speech recognition, taking cleft lip and palate (CLP) speech as the case study. The augmentation techniques explored in this study include vocal tract length perturbation (VTLP), reverberation, speaking rate and pitch modification, and speech feature modification using cycle-consistent adversarial networks (CycleGAN). Our study finds that data augmentation significantly improves CLP speech recognition performance, most evidently with CycleGAN-based feature modification, VTLP, and reverberation. More specifically, our systems achieve a lower phone error rate than systems trained without data augmentation.
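The abstract reports results as phone error rate. As a rough illustration of that metric only, the sketch below computes it as the Levenshtein (edit) distance between a reference and a hypothesized phone sequence, divided by the reference length. The function name `phone_error_rate` and the example phone labels are hypothetical; the abstract does not describe the authors' scoring tool, which may differ in details such as phone-set mapping.

```python
def phone_error_rate(reference, hypothesis):
    """Edit distance between two phone sequences, divided by reference length.

    Counts substitutions, deletions, and insertions with unit cost.
    This is an illustrative sketch, not the scoring script used in the paper.
    """
    if not reference:
        raise ValueError("reference must contain at least one phone")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j phones of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_phone in enumerate(reference, start=1):
        curr = [i]  # aligning i reference phones with an empty hypothesis
        for j, hyp_phone in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_phone == hyp_phone else 1)
            deletion = prev[j] + 1       # reference phone missing in hypothesis
            insertion = curr[j - 1] + 1  # extra phone in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(reference)


if __name__ == "__main__":
    # Hypothetical phone labels, for illustration only.
    ref = ["k", "ae", "t"]
    hyp = ["k", "t"]
    print(f"PER = {phone_error_rate(ref, hyp):.3f}")
```

Because the denominator is the reference length, a hypothesis with many insertions can yield a value above 1.0.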

Authors (4)
  1. Protima Nomo Sudro (3 papers)
  2. Rohan Kumar Das (50 papers)
  3. Rohit Sinha (16 papers)
  4. S. R. Mahadeva Prasanna (76 papers)
Citations (4)
