
On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition (2203.14593v3)

Published 28 Mar 2022 in eess.AS, cs.AI, cs.LG, and cs.SD

Abstract: Accurate recognition of dysarthric and elderly speech remains a challenging task to date. Speaker-level heterogeneity attributed to accent or gender, when aggregated with age and speech impairment, creates large diversity among these speakers. Scarcity of speaker-level data limits the practical use of data-intensive model-based speaker adaptation methods. To this end, this paper proposes two novel forms of data-efficient, feature-based on-the-fly speaker adaptation methods: variance-regularized spectral basis embedding (SVR) and spectral feature driven f-LHUC transforms. Experiments conducted on the UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest that the proposed on-the-fly speaker adaptation approaches consistently outperform baseline iVector-adapted hybrid DNN/TDNN and E2E Conformer systems by statistically significant WER reductions of 2.48%-2.85% absolute (7.92%-8.06% relative), and offline model-based LHUC adaptation by 1.82% absolute (5.63% relative).

Authors (10)
  1. Mengzhe Geng
  2. Xurong Xie
  3. Rongfeng Su
  4. Jianwei Yu
  5. Zengrui Jin
  6. Tianzi Wang
  7. Shujie Hu
  8. Zi Ye
  9. Helen Meng
  10. Xunying Liu
Citations (4)