Identification of Cognitive Decline from Spoken Language through Feature Selection and the Bag of Acoustic Words Model (2402.01824v1)
Abstract: Memory disorders are a central factor in the decline of functioning and daily activities in elderly individuals. Confirming the illness, initiating medication to slow its progression, and starting occupational therapy aimed at maintaining and rehabilitating cognitive abilities all require a medical diagnosis. Early identification of the symptoms of memory disorders, especially the decline in cognitive abilities, plays a significant role in ensuring the well-being of populations. Features related to speech production are known to be connected with the speaker's cognitive ability and its changes. The lack of standardized speech tests in clinical settings has led to a growing emphasis on developing automatic machine learning techniques for analyzing naturally spoken language. Non-lexical, acoustic properties of spoken language have proven useful when fast, cost-effective, and scalable solutions are needed for the rapid diagnosis of a disease. This work presents a feature selection approach that automatically selects the features essential for diagnosis from the Geneva minimalistic acoustic parameter set, which is intended for automatic paralinguistic and clinical speech analysis, and from relative speech pauses. These features are refined into word histogram features, on which machine learning classifiers are trained to distinguish control subjects from dementia patients in the DementiaBank Pitt audio database. The results show that an average classification accuracy of 75% is achievable with only twenty-five features, both on the separate ADReSS 2020 competition test data and in Leave-One-Subject-Out cross-validation over the entire competition data. The results rank at the top among international studies in which the same dataset and only acoustic features have been used to diagnose patients.
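The pipeline outlined in the abstract (frame-level acoustic features quantized into "acoustic words", one word histogram per recording, then Leave-One-Subject-Out evaluation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the data are synthetic, plain k-means stands in for the clustering step, a nearest-class-mean rule stands in for the classifiers, each subject is assumed to contribute exactly one recording, and the names `fit_codebook`, `boaw_histogram`, and `loso_accuracy` are hypothetical. Feature selection is omitted.

```python
import numpy as np

def fit_codebook(frames, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over frame-level feature vectors."""
    rng = np.random.default_rng(seed)
    centroids = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every frame to its nearest centroid (squared Euclidean distance).
        dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assignment = dists.argmin(axis=1)
        for j in range(k):
            members = frames[assignment == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def boaw_histogram(frames, centroids):
    """Normalized count of how often each acoustic word occurs in one recording."""
    dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(centroids))
    return counts / counts.sum()

def loso_accuracy(recordings, labels, k=8):
    """Leave-One-Subject-Out: hold out one subject, fit on the rest, repeat."""
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(recordings)):
        train_idx = [j for j in range(len(recordings)) if j != i]
        # The codebook is learned from training subjects only, to avoid leakage.
        codebook = fit_codebook(np.vstack([recordings[j] for j in train_idx]), k)
        train_h = np.array([boaw_histogram(recordings[j], codebook) for j in train_idx])
        test_h = boaw_histogram(recordings[i], codebook)
        # Nearest-class-mean rule: an assumed stand-in for the paper's classifiers.
        train_y = labels[train_idx]
        classes = np.unique(train_y)
        means = np.array([train_h[train_y == c].mean(axis=0) for c in classes])
        pred = classes[((means - test_h) ** 2).sum(axis=1).argmin()]
        correct += int(pred == labels[i])
    return correct / len(recordings)

# Synthetic stand-in data: 10 subjects, 60 frames each, 4 features per frame.
rng = np.random.default_rng(1)
labels = [0, 1] * 5
recordings = [rng.normal(loc=2.0 * y, scale=1.0, size=(60, 4)) for y in labels]

codebook = fit_codebook(np.vstack(recordings), k=8)
hist = boaw_histogram(recordings[0], codebook)
acc = loso_accuracy(recordings, labels, k=8)
print("histogram:", np.round(hist, 2))
print("LOSO accuracy:", acc)
```

Refitting the codebook inside every fold keeps the held-out subject out of the vocabulary as well as the classifier, which is the point of subject-wise cross-validation. With real data, the synthetic arrays would be replaced by per-recording matrices of acoustic descriptors.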
Authors: Marko Niemelä, Mikaela von Bonsdorff, Sami Äyrämö, Tommi Kärkkäinen