M&M: Multimodal-Multitask Model Integrating Audiovisual Cues in Cognitive Load Assessment (2403.09451v1)
Abstract: This paper introduces M&M, a novel multimodal-multitask learning framework applied to the AVCAffe dataset for cognitive load assessment (CLA). M&M integrates audiovisual cues through a dual-pathway architecture with specialized streams for audio and video inputs. A key innovation is its cross-modality multihead attention mechanism, which fuses the two modalities for synchronized multitasking. The model also has three specialized branches, each tailored to a specific cognitive load label, enabling task-specific analysis. Although its performance is modest compared to AVCAffe's single-task baseline, M&M demonstrates a promising framework for integrated multimodal processing. This work paves the way for future enhancements in multimodal-multitask learning systems, emphasizing the fusion of diverse data types for complex task handling.
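The architecture described in the abstract (two modality streams, cross-modality multihead attention, three task branches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the audio and video encoders are replaced by random feature sequences, the attention direction (audio queries attending to video keys and values), the mean pooling, the linear branches, all dimensions, and the task names `task_a`, `task_b`, `task_c` are assumptions made purely for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, d_model):
    # Hypothetical projection matrices for query, key, value, and output.
    scale = 1.0 / np.sqrt(d_model)
    names = ("w_q", "w_k", "w_v", "w_o")
    return {n: rng.normal(0.0, scale, (d_model, d_model)) for n in names}


def cross_modal_attention(query_seq, context_seq, params, num_heads):
    """Queries come from one modality; keys and values from the other."""
    t_q, d_model = query_seq.shape
    t_c = context_seq.shape[0]
    d_head = d_model // num_heads

    def split(x, t):
        # (t, d_model) -> (heads, t, d_head)
        return x.reshape(t, num_heads, d_head).transpose(1, 0, 2)

    q = split(query_seq @ params["w_q"], t_q)
    k = split(context_seq @ params["w_k"], t_c)
    v = split(context_seq @ params["w_v"], t_c)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, t_q, t_c)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, t_q, d_head)
    merged = heads.transpose(1, 0, 2).reshape(t_q, d_model)
    return merged @ params["w_o"], weights


def forward(audio_feats, video_feats, attn_params, task_heads, num_heads=4):
    # Assumption: audio frames query the video frames.
    fused, weights = cross_modal_attention(
        audio_feats, video_feats, attn_params, num_heads
    )
    pooled = fused.mean(axis=0)  # average over time steps
    # One linear branch per cognitive load label (placeholder names).
    logits = {t: float(pooled @ w + b) for t, (w, b) in task_heads.items()}
    return logits, weights


rng = np.random.default_rng(0)
d_model = 32
audio_feats = rng.normal(size=(10, d_model))  # stand-in for audio stream output
video_feats = rng.normal(size=(16, d_model))  # stand-in for video stream output
attn_params = init_params(rng, d_model)
task_names = ("task_a", "task_b", "task_c")
task_heads = {t: (rng.normal(size=d_model), 0.0) for t in task_names}

logits, weights = forward(audio_feats, video_feats, attn_params, task_heads)
print(sorted(logits), weights.shape)
```

Each of the three branches reads the same fused representation, which is the sense in which the sketch is multitask; a trained model would learn the projection and branch weights rather than sample them.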