
Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder (2310.03985v2)

Published 6 Oct 2023 in cs.CL, cs.LG, cs.SD, and eess.AS

Abstract: Diagnosing dementia requires a series of different tests, a process that is complex and time-consuming. Early detection is crucial because it can prevent further deterioration of the condition. This paper uses a speech recognition model to build a dementia assessment system for Mandarin speakers performing a picture description task. By training an attention-based speech recognition model on voice data closely resembling real-world scenarios, we significantly improved the model's recognition capabilities. We then extracted the encoder from the speech recognition model and added a linear layer for dementia assessment. We collected Mandarin speech data from 99 subjects and obtained their clinical assessments from a local hospital. The system achieved an accuracy of 92.04% in Alzheimer's disease detection and a mean absolute error of 9% in clinical dementia rating score prediction.
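
The "encoder plus linear layer" design described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the names `mean_pool`, `linear`, `assess`, and `toy_encoder` are hypothetical, the identity encoder is a stand-in for a trained speech recognition encoder, and averaging the encoder frames over time before the linear layer is an assumption, since the abstract does not say how the frame sequence is reduced to a single vector.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def mean_pool(frames: Sequence[Frame]) -> Frame:
    """Average encoder output frames over time into one fixed-size vector.

    Assumption: the abstract does not specify the pooling step.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def linear(x: Frame, weights: Sequence[Frame], bias: Frame) -> Frame:
    """Plain linear layer: one output value per weight row."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def assess(
    features: Sequence[Frame],
    encoder: Callable[[Sequence[Frame]], List[Frame]],
    weights: Sequence[Frame],
    bias: Frame,
) -> Frame:
    """Run the encoder, pool its frames, and apply the assessment head."""
    hidden = encoder(features)
    return linear(mean_pool(hidden), weights, bias)


def toy_encoder(features: Sequence[Frame]) -> List[Frame]:
    """Hypothetical stand-in for a trained ASR encoder (identity mapping)."""
    return [list(frame) for frame in features]


# Two acoustic frames with two features each; the head has two outputs.
features = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.5, -0.5]
scores = assess(features, toy_encoder, weights, bias)
print(scores)  # [2.5, 2.5]
```

In a real system the head's outputs would be trained against clinical labels, for example class logits for Alzheimer's disease detection or a single regressed value for the clinical dementia rating score.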

