
A Novel Audio-Visual Information Fusion System for Mental Disorders Detection (2409.02243v1)

Published 3 Sep 2024 in cs.CV

Abstract: Mental disorders are among the foremost contributors to the global healthcare burden. Research indicates that timely diagnosis and intervention are vital in treating many of them. However, the early somatization symptoms of certain mental disorders may not be immediately evident, so they are often overlooked or misdiagnosed. Traditional diagnostic methods are also slow and costly. Deep learning methods based on fMRI and EEG have made mental disorder detection more efficient, but the equipment and trained staff they require are generally expensive. Moreover, most systems are trained for one specific mental disorder and are not general-purpose. Recently, physiological studies have shown that some mental disorders (e.g., depression and ADHD) have speech-related and facial symptoms. In this paper, we focus on the emotional expression features of mental disorders and introduce a multimodal mental disorder diagnosis system based on audio-visual input. Our proposed system is based on spatial-temporal attention networks and, as a novel step, uses a less computationally intensive pre-trained audio recognition network to fine-tune the video recognition module for better results. We also apply a unified system to multiple mental disorders (ADHD and depression) for the first time. The proposed system achieves over 80% accuracy on a real multimodal ADHD dataset and state-of-the-art results on the depression dataset AVEC 2014.
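
The abstract does not describe how the audio and visual streams are combined, so the following is only a minimal sketch of one generic option, decision-level (late) fusion, and should not be read as the authors' method. The function names `late_fusion` and `predict`, the `audio_weight` parameter, and the two-class logits are all hypothetical choices made for illustration.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(
    audio_logits: List[float],
    video_logits: List[float],
    audio_weight: float = 0.5,  # hypothetical mixing weight, not from the paper
) -> List[float]:
    """Blend per-modality class probabilities with a convex combination."""
    if len(audio_logits) != len(video_logits):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    audio_probs = softmax(audio_logits)
    video_probs = softmax(video_logits)
    return [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio_probs, video_probs)
    ]


def predict(
    audio_logits: List[float],
    video_logits: List[float],
    audio_weight: float = 0.5,
) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(audio_logits, video_logits, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

Because the fused scores are a convex combination of two probability vectors, they still sum to one. Setting `audio_weight` to 0 or 1 falls back to a single modality, which gives a simple baseline for checking whether fusion helps at all.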
