Increase Apparent Public Speaking Fluency By Speech Augmentation (1812.03415v2)

Published 9 Dec 2018 in cs.SD and eess.AS

Abstract: Fluent and confident speech is desirable to every speaker, but delivering speech professionally requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system that helps non-professional speakers produce fluent, professional-sounding speech, in turn contributing to better listener engagement and comprehension. We achieve this by manipulating the disfluencies in human speech, such as the sounds 'uh' and 'um', filler words, and awkwardly long silences. Given any unrehearsed speech, we segment and silence the filled pauses, then adjust the duration of the imposed silences as well as other long ('disfluent') pauses using a predictive model learned from a dataset of professional speech. Finally, we output an audio stream in which the speaker sounds more fluent, confident, and practiced than in the original recording. According to our quantitative evaluation, we significantly increase the fluency of speech by reducing the rate of pauses and fillers.
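
The pipeline described in the abstract (silence the filled pauses, then adjust pause durations) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the speech has already been segmented and labeled, and it replaces the paper's learned predictive model with a fixed cap on pause length. The names `Segment`, `smooth_pauses`, and `max_pause` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str        # "speech", "filler", or "silence" (assumed labels)
    duration: float  # length in seconds


def smooth_pauses(segments: List[Segment], max_pause: float = 0.5) -> List[Segment]:
    """Silence fillers, merge adjacent pauses, and cap long pauses.

    max_pause is a fixed stand-in for the paper's predictive model,
    which this sketch does not attempt to reproduce.
    """
    merged: List[Segment] = []
    for seg in segments:
        # Step 1: a filled pause ("uh", "um") is treated as silence.
        kind = "silence" if seg.kind in ("filler", "silence") else "speech"
        if kind == "silence" and merged and merged[-1].kind == "silence":
            # Merge with the preceding pause so it is judged as one gap.
            merged[-1] = Segment("silence", merged[-1].duration + seg.duration)
        else:
            merged.append(Segment(kind, seg.duration))

    # Step 2: shorten any pause longer than the cap; speech is untouched.
    return [
        Segment(s.kind, min(s.duration, max_pause)) if s.kind == "silence" else s
        for s in merged
    ]


example = [
    Segment("speech", 1.2),
    Segment("filler", 0.4),
    Segment("silence", 0.9),
    Segment("speech", 2.0),
    Segment("silence", 0.2),
]
result = smooth_pauses(example)
```

In this example the filler and the silence that follows it merge into a single pause that is then capped at `max_pause`, while the short final pause is left unchanged. A real system would operate on audio samples and would choose each pause length from context rather than from a single constant.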

Authors (4)
  1. Sagnik Das (9 papers)
  2. Nisha Gandhi (1 paper)
  3. Tejas Naik (1 paper)
  4. Roy Shilkrot (3 papers)
Citations (6)
