Papers
Topics
Authors
Recent
Search
2000 character limit reached

Typical vs. Atypical Disfluency Classification: Introducing the IIITH-TISA Corpus and Temporal Context-Based Feature Representations

Published 26 Nov 2024 in eess.AS | (2411.17149v1)

Abstract: Speech disfluencies in spontaneous communication can be categorized as either typical or atypical. Typical disfluencies, such as hesitations and repetitions, are natural occurrences in everyday speech, while atypical disfluencies are indicative of pathological disorders like stuttering. Distinguishing between these categories is crucial for improving voice assistants (VAs) for Persons Who Stutter (PWS), who often face premature cutoffs due to misidentification of speech termination. Accurate classification also aids in detecting stuttering early in children, preventing misdiagnosis as language development disfluency. This research introduces the IIITH-TISA dataset, the first Indian English stammer corpus, capturing atypical disfluencies. Additionally, we extend the IIITH-IED dataset with detailed annotations for typical disfluencies. We propose Perceptually Enhanced Zero-Time Windowed Cepstral Coefficients (PE-ZTWCC) combined with Shifted Delta Cepstra (SDC) as input features to a shallow Time Delay Neural Network (TDNN) classifier, capturing both local and wider temporal contexts. Our method achieves an average F1 score of 85.01% for disfluency classification, outperforming traditional features.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.