
Unsupervised Cross-Lingual Speech Emotion Recognition Using Pseudo Multilabel (2108.08663v2)

Published 19 Aug 2021 in eess.AS

Abstract: Speech Emotion Recognition (SER) in a single language has achieved remarkable results through deep learning approaches in the last decade. However, cross-lingual SER remains a challenge in real-world applications due to the large difference between the source and target domain distributions. To address this issue, we propose an unsupervised cross-lingual Neural Network with Pseudo Multilabel (NNPM) that is trained to learn the emotion similarities between source domain features inside an external memory adjusted to identify emotion in cross-lingual databases. NNPM introduces a novel approach that leverages external memory to store source domain features and generates a pseudo multilabel for each target domain sample by computing the similarities between the external memory and the target domain features. We evaluate our approach on speech emotion databases in multiple languages. Experimental results show that our proposed approach significantly improves the weighted accuracy (WA) across multiple low-resource languages on the Urdu, Skropus, ShEMO, and EMO-DB corpora. To facilitate further research, code is available at https://github.com/happyjin/NNPM
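
The pseudo-multilabel idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes cosine similarity as the similarity measure and a top-k nearest-neighbour rule for turning similarities into a multi-hot label; the names `cosine`, `pseudo_multilabel`, `memory`, and `k` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pseudo_multilabel(target_feature, memory, num_classes, k=3):
    """Build a multi-hot pseudo label for one unlabeled target sample.

    memory: list of (source_feature, emotion_class) pairs, standing in
            for the external memory of source domain features.
    Assumption: the emotion classes of the k most similar memory entries
    are marked as active. The paper's actual rule may differ.
    """
    sims = [(cosine(target_feature, feat), label) for feat, label in memory]
    top_k = sorted(sims, key=lambda s: s[0], reverse=True)[:k]
    multi_hot = [0] * num_classes
    for _, label in top_k:
        multi_hot[label] = 1
    return multi_hot


# Toy external memory: 2-D source features with emotion classes 0, 1, 2.
memory = [
    ([1.0, 0.0], 0),
    ([0.9, 0.1], 0),
    ([0.0, 1.0], 1),
    ([0.1, 0.9], 1),
    ([-1.0, 0.0], 2),
]

# An unlabeled target-domain feature that lies mostly near class 0.
label = pseudo_multilabel([0.8, 0.3], memory, num_classes=3, k=3)
print(label)  # [1, 1, 0]
```

In this toy case the target sample's three nearest memory entries belong to classes 0, 0, and 1, so the pseudo label marks both classes as plausible instead of committing to a single one.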

Authors (3)
  1. Jin Li (366 papers)
  2. Nan Yan (11 papers)
  3. Lan Wang (113 papers)
Citations (6)
