Deep functional multiple index models with an application to SER (2403.17562v1)
Published 26 Mar 2024 in cs.SD, eess.AS, and stat.AP
Abstract: Speech Emotion Recognition (SER) plays a crucial role in advancing human-computer interaction and speech processing capabilities. We introduce a novel deep-learning architecture designed specifically for the functional data model known as the multiple-index functional model. Our key innovation lies in integrating adaptive basis layers and an automated data transformation search within the deep learning framework. Simulation studies of this new model show good performance. This allows us to extract features tailored to chunk-level SER, based on Mel-Frequency Cepstral Coefficients (MFCCs). We demonstrate the effectiveness of our approach on the benchmark IEMOCAP database, achieving performance competitive with existing methods.
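In the functional data analysis literature, a multiple-index functional model typically takes the form Y = g(⟨X, β₁⟩, …, ⟨X, βₚ⟩) + ε, where each index ⟨X, βₖ⟩ = ∫ X(t) βₖ(t) dt projects the functional covariate X onto a learned direction βₖ. The sketch below is a minimal illustration of this idea, not the authors' implementation: it assumes PyTorch, a uniform sampling grid, and hypothetical names and layer sizes. The βₖ are parameterized by a small network (in the spirit of adaptive basis layers), the indices are computed by a Riemann-sum quadrature, and an MLP plays the role of the nonlinear link g.

```python
# Minimal sketch (assumed names and sizes, not the paper's code): a
# multiple-index functional model Y ≈ g(<X, beta_1>, ..., <X, beta_p>),
# with the index functions beta_k produced by a learned basis network.
import torch
import torch.nn as nn

class AdaptiveBasisLayer(nn.Module):
    """Learns p index functions beta_k(t), evaluated on a fixed grid."""
    def __init__(self, n_indices: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, n_indices)
        )

    def forward(self, x: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
        # x: (batch, T) curve values on grid; grid: (T,) time points in [0, 1]
        basis = self.net(grid.unsqueeze(-1))   # (T, p): beta_k evaluated on grid
        dt = 1.0 / (grid.numel() - 1)          # uniform-grid quadrature weight
        return x @ basis * dt                  # (batch, p): <X, beta_k> by Riemann sum

class MultipleIndexFNN(nn.Module):
    def __init__(self, n_indices: int = 3, n_classes: int = 4):
        super().__init__()
        self.basis = AdaptiveBasisLayer(n_indices)
        # Nonlinear link g applied to the index scores (here: an MLP classifier).
        self.link = nn.Sequential(
            nn.Linear(n_indices, 32), nn.ReLU(), nn.Linear(32, n_classes)
        )

    def forward(self, x: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
        return self.link(self.basis(x, grid))

# Toy usage: 8 curves sampled at 128 points (e.g., one MFCC trajectory per
# chunk), classified into 4 emotion categories.
grid = torch.linspace(0.0, 1.0, 128)
x = torch.randn(8, 128)
logits = MultipleIndexFNN()(x, grid)           # shape (8, 4)
```

Because the basis network takes the time point t as input, the learned βₖ are genuine functions that can be evaluated on any grid, which is what makes this formulation suitable for functional covariates such as framewise MFCC trajectories.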