
Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion (2210.13756v3)

Published 25 Oct 2022 in eess.AS and cs.SD

Abstract: Emotional voice conversion (EVC) traditionally targets the transformation of spoken utterances from one emotional state to another, with previous research mainly focusing on discrete emotion categories. This paper departs from the norm by introducing a novel perspective: a nuanced rendering of mixed emotions and enhancing control over emotional expression. To achieve this, we propose a novel EVC framework, Mixed-EVC, which only leverages discrete emotion training labels. We construct an attribute vector that encodes the relationships among these discrete emotions, which is predicted using a ranking-based support vector machine and then integrated into a sequence-to-sequence (seq2seq) EVC framework. Mixed-EVC not only learns to characterize the input emotional style but also quantifies its relevance to other emotions during training. As a result, users have the ability to assign these attributes to achieve their desired rendering of mixed emotions. Objective and subjective evaluations confirm the effectiveness of our approach in terms of mixed emotion synthesis and control while surpassing traditional baselines in the conversion of discrete emotions from one to another.

Authors (5)
  1. Kun Zhou (217 papers)
  2. Carlos Busso (25 papers)
  3. Bin Ma (78 papers)
  4. Haizhou Li (286 papers)
  5. Berrak Sisman (49 papers)
Citations (2)
