
A study on the impact of Self-Supervised Learning on automatic dysarthric speech assessment (2306.04337v2)

Published 7 Jun 2023 in cs.CL

Abstract: Automating dysarthria assessments offers the opportunity to develop practical, low-cost tools that address the current limitations of manual and subjective assessments. Nonetheless, the small size of most dysarthria datasets makes it challenging to develop automated assessment tools. Recent research showed that speech representations from models pre-trained on large unlabelled data can enhance Automatic Speech Recognition (ASR) performance for dysarthric speech. We are the first to evaluate representations from pre-trained state-of-the-art Self-Supervised models on dysarthric speech across three downstream tasks (disease classification, word recognition, and intelligibility classification) and under three noise scenarios on the UA-Speech dataset. We show that HuBERT is the most versatile feature extractor across dysarthria classification, word recognition, and intelligibility classification, achieving $+24.7\%$, $+61\%$, and $+7.2\%$ accuracy, respectively, compared to classical acoustic features.
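
As a rough illustration of the feature-extraction setup the abstract describes (pre-trained self-supervised representations fed to downstream classifiers), the sketch below shows one way to obtain utterance-level HuBERT embeddings with the Hugging Face transformers and torchaudio libraries. The checkpoint name, mean pooling, and example file path are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch (not the paper's exact pipeline): extract frame-level HuBERT
# representations and mean-pool them into a single utterance vector that a
# downstream classifier (e.g., dysarthria or intelligibility classification)
# could consume.
import torch
import torchaudio
from transformers import AutoFeatureExtractor, HubertModel

MODEL_NAME = "facebook/hubert-base-ls960"  # assumed checkpoint, not specified by the paper
feature_extractor = AutoFeatureExtractor.from_pretrained(MODEL_NAME)
model = HubertModel.from_pretrained(MODEL_NAME).eval()

def utterance_embedding(wav_path: str) -> torch.Tensor:
    """Return a mean-pooled HuBERT embedding for one utterance."""
    waveform, sr = torchaudio.load(wav_path)
    if sr != 16000:  # HuBERT expects 16 kHz audio
        waveform = torchaudio.functional.resample(waveform, sr, 16000)
    inputs = feature_extractor(
        waveform.squeeze(0).numpy(), sampling_rate=16000, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, num_frames, 768)
    return hidden.mean(dim=1).squeeze(0)            # (768,) utterance-level vector

# Hypothetical usage on a UA-Speech-style recording:
# embedding = utterance_embedding("speaker01_word_rep1.wav")
```

Such fixed-size embeddings can then be compared against classical acoustic features (e.g., MFCCs) by training the same downstream classifier on each representation.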
