Papers
Topics
Authors
Recent
Search
2000 character limit reached

Language vs Speaker Change: A Comparative Study

Published 5 Mar 2022 in eess.AS, cs.SD, and eess.SP | (2203.02680v2)

Abstract: Spoken language change detection (LCD) refers to detecting language switching points in a multilingual speech signal. Speaker change detection (SCD) refers to locating the speaker change points in a multispeaker speech signal. The objective of this work is to understand the challenges in LCD task by comparing it with SCD task. Human subjective study for change detection is performed for LCD and SCD. This study demonstrates that LCD requires larger duration spectro-temporal information around the change point compared to SCD. Based on this, the work explores automatic distance based and model based LCD approaches. The model based ones include Gaussian mixture model and universal background model (GMM-UBM), attention, and Generative adversarial network (GAN) based approaches. Both the human and automatic LCD tasks infer that the performance of the LCD task improves by incorporating more and more spectro-temporal duration.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.