Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder (2310.03985v2)
Abstract: Diagnosing dementia requires a series of different tests, which makes the process complex and time-consuming. Early detection is crucial because it can prevent further deterioration of the condition. This paper uses a speech recognition model to build a dementia assessment system for Mandarin speakers performing the picture description task. Training an attention-based speech recognition model on voice data that closely resembles real-world scenarios significantly improved the model's recognition capability. We then extracted the encoder from the speech recognition model and added a linear layer for dementia assessment. We collected Mandarin speech data from 99 subjects and obtained their clinical assessments from a local hospital. The system achieved an accuracy of 92.04% in Alzheimer's disease detection and a mean absolute error of 9% in clinical dementia rating score prediction.
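To make the "encoder plus linear layer" idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the encoder has already turned an utterance into a sequence of frame vectors. Mean pooling over time, the sigmoid output, the random weights, and the names `mean_pool`, `LinearHead`, and `assess` are all illustrative assumptions; the abstract does not say how frames are aggregated or how the layer is trained.

```python
import math
import random


def mean_pool(frames):
    """Average a T x D list of frame vectors into a single D-dim vector.

    Assumption: the abstract does not state the pooling method.
    """
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


class LinearHead:
    """A single linear layer, y = w . x + b, with untrained random weights."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.b = 0.0

    def __call__(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def assess(encoder_frames, head):
    """Pool the encoder output over time and map it to a score in (0, 1)."""
    pooled = mean_pool(encoder_frames)
    return sigmoid(head(pooled))


# Toy stand-in for encoder output: 5 frames of 4-dimensional features.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]

head = LinearHead(dim=4)
prob = assess(frames, head)
print(f"Illustrative detection score: {prob:.3f}")
```

In a real system the frame vectors would come from the trained speech recognition encoder, and the linear layer's weights would be learned from the labelled clinical data rather than sampled at random.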