
Sentiment-Aware Automatic Speech Recognition pre-training for enhanced Speech Emotion Recognition (2201.11826v1)

Published 27 Jan 2022 in cs.CL, cs.SD, and eess.AS

Abstract: We propose a novel multi-task pre-training method for Speech Emotion Recognition (SER). We pre-train the SER model simultaneously on Automatic Speech Recognition (ASR) and sentiment classification tasks to make the acoustic ASR model more "emotion-aware". We generate targets for the sentiment classification using a text-to-sentiment model trained on publicly available data. Finally, we fine-tune the acoustic ASR model on emotion-annotated speech data. We evaluated the proposed approach on the MSP-Podcast dataset, where we achieved the best reported concordance correlation coefficient (CCC) of 0.41 for valence prediction.
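
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of the concordance correlation coefficient in its standard form, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (divide-by-n) statistics. The function name and the sample values are hypothetical and chosen for illustration; this is not the authors' evaluation code, and their exact computation may differ.

```python
from statistics import fmean


def concordance_correlation_coefficient(y_true, y_pred):
    """Standard CCC between two equal-length sequences of numbers.

    Hypothetical helper for illustration; uses population
    (divide-by-n) variance and covariance.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = fmean(y_true)
    mean_pred = fmean(y_pred)

    # Population variances and covariance.
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred) / n
    cov = sum(
        (t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred)
    ) / n

    denom = var_true + var_pred + (mean_true - mean_pred) ** 2
    if denom == 0:
        # Both sequences are constant and equal: CCC is undefined here.
        raise ValueError("CCC is undefined for identical constant inputs")

    return 2.0 * cov / denom


if __name__ == "__main__":
    # Made-up valence labels and predictions, for illustration only.
    labels = [1.0, 2.0, 3.0]
    shifted = [2.0, 3.0, 4.0]
    print(concordance_correlation_coefficient(labels, labels))
    print(concordance_correlation_coefficient(labels, shifted))
```

Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels: the shifted predictions above are perfectly correlated with the labels yet score below 1.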

Authors (5)
  1. Ayoub Ghriss (3 papers)
  2. Bo Yang (427 papers)
  3. Viktor Rozgic (11 papers)
  4. Elizabeth Shriberg (6 papers)
  5. Chao Wang (555 papers)
Citations (20)
