Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
139 tokens/sec
GPT-4o
47 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Emotion Recognition Using Speaker Cues (2002.03566v1)

Published 4 Feb 2020 in eess.AS, cs.LG, and cs.SD

Abstract: This research aims at identifying the unknown emotion using speaker cues. In this study, we identify the unknown emotion using a two-stage framework. The first stage focuses on identifying the speaker who uttered the unknown emotion, while the next stage focuses on identifying the unknown emotion uttered by the recognized speaker in the prior stage. This proposed framework has been evaluated on an Arabic Emirati-accented speech database uttered by fifteen speakers per gender. Mel-Frequency Cepstral Coefficients (MFCCs) have been used as the extracted features and Hidden Markov Model (HMM) has been utilized as the classifier in this work. Our findings demonstrate that emotion recognition accuracy based on the two-stage framework is greater than that based on the one-stage approach and the state-of-the-art classifiers and models such as Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and Vector Quantization (VQ). The average emotion recognition accuracy based on the two-stage approach is 67.5%, while the accuracy reaches to 61.4%, 63.3%, 64.5%, and 61.5%, based on the one-stage approach, GMM, SVM, and VQ, respectively. The achieved results based on the two-stage framework are very close to those attained in subjective assessment by human listeners.

Citations (2)

Summary

We haven't generated a summary for this paper yet.