A Novel Scheme to classify Read and Spontaneous Speech (2306.08012v1)
Abstract: The COVID-19 pandemic has led to an increased use of remote telephonic interviews, making it important to distinguish between scripted and spontaneous speech in audio recordings. In this paper, we propose a novel scheme for identifying read and spontaneous speech. Our approach uses a pre-trained DeepSpeech audio-to-alphabet recognition engine to generate a sequence of alphabets from the audio. From these alphabets, we derive features that allow us to discriminate between read and spontaneous speech. Our experimental results show that even a small set of self-explanatory features can classify the two types of speech very effectively.
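The abstract does not list the features that are derived from the alphabet sequence, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the audio-to-alphabet engine has already produced a plain character string and that the recording duration is known. The feature names (`chars_per_second`, `words_per_second`, `mean_word_length`, `longest_repeat_run`) and the helper functions are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AlphabetFeatures:
    # All four features are illustrative assumptions, not taken from the paper.
    chars_per_second: float
    words_per_second: float
    mean_word_length: float
    longest_repeat_run: int


def longest_run(word: str) -> int:
    """Length of the longest run of one repeated character in a word."""
    best = 0
    run = 0
    prev = None
    for ch in word:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best


def extract_features(alphabets: str, duration_s: float) -> AlphabetFeatures:
    """Derive simple descriptive features from a recognised character string.

    `alphabets` stands in for the output of an audio-to-alphabet engine;
    `duration_s` is the length of the audio in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    words = alphabets.split()
    n_chars = sum(len(w) for w in words)
    return AlphabetFeatures(
        chars_per_second=n_chars / duration_s,
        words_per_second=len(words) / duration_s,
        mean_word_length=n_chars / len(words) if words else 0.0,
        longest_repeat_run=max((longest_run(w) for w in words), default=0),
    )


if __name__ == "__main__":
    # Hypothetical strings, not real recogniser output.
    print(extract_features("the cat sat", 2.0))
    print(extract_features("ummm i think", 4.0))
```

Features of this kind could then be passed to any standard classifier. Which features and which classifier actually separate read from spontaneous speech is a question the paper's experiments address, not this sketch.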
- Bradlow, A.R.: ALLSSTAR: archive of L1 and L2 scripted and spontaneous transcripts and recordings. https://speechbox.linguistics.northwestern.edu/ (2023)
- Huggingface: speaker-diarization. https://huggingface.co/pyannote/speaker-diarization (pyannote/speaker-diarization@2022072, 2022)
- Kopparapu, S.K.: Air-DB: A dataset for classifying spontaneous and read speech. https://drive.google.com/drive/folders/1-31cSrppLGiG1bgij6rFlnw5rChifi5C?usp=sharing (2022)
- Mozilla: Deepspeech. https://github.com/mozilla/DeepSpeech/releases (Jan 2019)
- PrasarBharati: All India Radio. https://newsonair.gov.in/ (2022)
- Ward, W.: Understanding spontaneous speech. Speech and Natural Language Workshop pp. 365–367 (1989)