
Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network (2012.11174v1)

Published 21 Dec 2020 in eess.AS and cs.AI

Abstract: By using deep learning approaches, Speech Emotion Recognition (SER) on a single domain has achieved many excellent results. However, cross-domain SER is still a challenging task due to the distribution shift between source and target domains. In this work, we propose a Domain Adversarial Neural Network (DANN) based approach to mitigate this distribution shift problem for cross-lingual SER. Specifically, we add a language classifier and gradient reversal layer after the feature extractor to force the learned representation to be both language-independent and emotion-meaningful. Our method is unsupervised, i.e., labels on the target language are not required, which makes it easier to apply our method to other languages. Experimental results show the proposed method provides an average absolute improvement of 3.91% over the baseline system for the arousal and valence classification task. Furthermore, we find that batch normalization is beneficial to the performance gain of DANN. Therefore, we also explore the effect of different ways of data combination for batch normalization.
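
The gradient reversal layer mentioned in the abstract can be illustrated with a minimal scalar sketch. This is not the authors' implementation: the toy one-weight feature extractor, the squared-error language loss, and every name and value below (`grl_forward`, `grl_backward`, `lam`, `x`, `w`, `v`, `y`) are assumptions chosen only to show the mechanism. The layer acts as the identity in the forward pass and flips the sign of the gradient in the backward pass, so the language classifier is trained normally while the feature extractor is pushed to make language harder to predict.

```python
# Minimal scalar sketch of a gradient reversal layer (GRL).
# All names and numbers are illustrative assumptions, not taken from the paper.

def grl_forward(features):
    """Forward pass: the GRL passes features through unchanged."""
    return list(features)

def grl_backward(grad_output, lam=1.0):
    """Backward pass: flip the sign of the incoming gradient and scale by lam."""
    return [-lam * g for g in grad_output]

# Toy model: feature f = w * x, language-classifier score s = v * f,
# language loss L = 0.5 * (s - y)^2.
x, w, v, y = 2.0, 0.5, 3.0, 0.0

# Forward pass.
f = w * x                       # feature extractor output
(f_grl,) = grl_forward([f])     # identity
s = v * f_grl                   # language classifier score

# Backward pass (chain rule written out by hand).
dL_ds = s - y
dL_dv = dL_ds * f_grl           # classifier gradient: not reversed
dL_df = dL_ds * v               # gradient arriving at the GRL output
(dL_df_rev,) = grl_backward([dL_df], lam=1.0)
dL_dw = dL_df_rev * x           # feature extractor gradient: sign flipped

print("classifier grad:", dL_dv)
print("feature extractor grad:", dL_dw)
```

With gradient descent, the classifier weight `v` moves to reduce the language loss, while the reversed gradient moves the extractor weight `w` to increase it, which is the adversarial pressure toward language-independent features. In a full DANN the emotion loss would also flow into the feature extractor without reversal.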

Authors (6)
  1. Xiong Cai (3 papers)
  2. Zhiyong Wu (171 papers)
  3. Kuo Zhong (2 papers)
  4. Bin Su (3 papers)
  5. Dongyang Dai (9 papers)
  6. Helen Meng (204 papers)
Citations (15)
