MC-ViViT: Multi-branch Classifier-ViViT to detect Mild Cognitive Impairment in older adults using facial videos (2304.05292v4)
Abstract: Deep machine learning models, including Convolutional Neural Networks (CNNs), have been successful in detecting Mild Cognitive Impairment (MCI) from medical images, questionnaires, and videos. This paper proposes a novel Multi-branch Classifier-Video Vision Transformer (MC-ViViT) model that distinguishes people with MCI from those with normal cognition by analyzing facial features. The data come from I-CONECT, a behavioral intervention trial aimed at improving cognitive function through frequent video chats. MC-ViViT extracts spatiotemporal features of videos in one branch and augments the representations with the MC module. The I-CONECT dataset is challenging because it is imbalanced, containing Hard-Easy and Positive-Negative samples, which impedes the performance of MC-ViViT. To address this imbalance, we propose a loss function for Hard-Easy and Positive-Negative Samples (HP Loss) that combines Focal loss and AD-CORRE loss. Our experimental results on the I-CONECT dataset show the great potential of MC-ViViT in predicting MCI, with an accuracy of 90.63% on some of the interview videos.
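As a rough illustration of the Focal loss component mentioned above, the following is a minimal sketch of the standard binary focal loss, which down-weights easy samples so that hard ones dominate the objective. It is not the paper's HP Loss: the AD-CORRE term and the way the two losses are combined are not described in the abstract and are not reproduced here. The function names and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration.

```python
import math


def focal_loss(prob_pos, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single sample.

    prob_pos: predicted probability of the positive class (e.g. MCI).
    label:    ground-truth class, 0 or 1.
    gamma:    focusing parameter; larger values down-weight easy samples more.
    alpha:    weight for the positive class (1 - alpha for the negative class).
    All defaults are illustrative assumptions, not values from the paper.
    """
    eps = 1e-12
    # p_t is the probability the model assigns to the true class.
    p_t = prob_pos if label == 1 else 1.0 - prob_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # An easy positive (p = 0.9) contributes far less than a hard one (p = 0.6).
    print(focal_loss(0.9, 1), focal_loss(0.6, 1))
    print(mean_focal_loss([0.9, 0.6, 0.2], [1, 1, 0]))
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is one way to check an implementation of this kind.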