
PoeticTTS -- Controllable Poetry Reading for Literary Studies (2207.05549v2)

Published 11 Jul 2022 in eess.AS, cs.CL, cs.LG, and cs.SD

Abstract: Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesise poems with almost human-like naturalness in order to enable literary scholars to systematically examine hypotheses on the interplay between text, spoken realisation, and the listener's perception of poems. To meet these special requirements for literary studies, we resynthesise poems by cloning prosodic values from a human reference recitation, and afterwards make use of fine-grained prosody control to manipulate the synthetic speech in a human-in-the-loop setting to alter the recitation with respect to specific phenomena. We find that finetuning our TTS model on poetry captures poetic intonation patterns to a large extent, which is beneficial for prosody cloning and manipulation, and we verify the success of our approach both in an objective evaluation and in human studies.

Authors (9)
  1. Julia Koch
  2. Florian Lux
  3. Nadja Schauffler
  4. Toni Bernhart
  5. Felix Dieterle
  6. Sandra Richter
  7. Gabriel Viehhauser
  8. Ngoc Thang Vu
  9. Jonas Kuhn
Citations (5)