Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer (2306.11662v2)

Published 20 Jun 2023 in eess.AS

Abstract: Speech generation for machine dubbing adds complexity to conventional Text-To-Speech solutions as the generated output is required to match the expressiveness, emotion and speaking rate of the source content. Capturing and transferring details and variations in prosody is a challenge. We introduce phrase-level cross-lingual prosody transfer for expressive multi-lingual machine dubbing. The proposed phrase-level prosody transfer delivers a significant 6.2% MUSHRA score increase over a baseline with utterance-level global prosody transfer, thereby closing the gap between the baseline and expressive human dubbing by 23.2%, while preserving intelligibility of the synthesised speech.

Authors (8)
  1. Duo Wang (47 papers)
  2. Mikolaj Babianski (3 papers)
  3. Giuseppe Coccia (2 papers)
  4. Patrick Lumban Tobing (20 papers)
  5. Ravichander Vipperla (6 papers)
  6. Viacheslav Klimkov (10 papers)
  7. Vincent Pollet (4 papers)
  8. Jakub Swiatkowski (4 papers)
Citations (2)
