Monaural Multi-Speaker Speech Separation Using Efficient Transformer Model (2308.00010v1)

Published 29 Jul 2023 in cs.SD, cs.LG, and eess.AS

Abstract: The cocktail party problem is the scenario in which it is difficult to separate or distinguish an individual speaker's voice from a mixture of several speakers. Research in this field is ongoing, but model size and complexity are typically traded off against the accuracy and robustness of speech separation. "Monaural multi-speaker speech separation" presents a speech-separation model based on the Transformer architecture and its efficient variants. The model is trained on the LibriMix dataset, which contains utterances from diverse speakers, and separates two distinct speaker sources from a mixed audio input. The work aims to reduce the computational complexity of speech separation with minimal loss of performance relative to prevalent speech-separation models, and it demonstrates significant progress toward that goal. The authors anticipate that this project will contribute to ongoing research in speech separation with computational efficiency at its core.
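The abstract describes separating two speaker sources from one mixed input. A common formulation of this (used by many Transformer-based separators, though the paper's exact architecture is not shown here) is mask-based separation: a network emits one mask per speaker over the mixture's time-frequency features, and each estimate is the masked mixture. The sketch below is a minimal, hypothetical illustration of that idea in NumPy; the random "logits" stand in for the output of a real encoder/Transformer, and all names are assumptions, not the authors' code.

```python
import numpy as np

def separate_two_speakers(mixture, mask_logits):
    """Apply per-speaker softmax masks to a mixture's feature representation.

    mixture:     (frames, bins)    magnitude features of the mixed audio
    mask_logits: (2, frames, bins) raw per-speaker scores from a separation
                 network (here, a stand-in for a Transformer's output)
    Returns:     (2, frames, bins) one masked estimate per speaker
    """
    # Softmax over the speaker axis: the two masks sum to 1 in every
    # time-frequency bin, so the mixture's energy is partitioned
    # between the two estimated speakers.
    shifted = mask_logits - mask_logits.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    masks = exp / exp.sum(axis=0, keepdims=True)
    return masks * mixture[None, :, :]

# Toy example with random features in place of a real STFT + model.
rng = np.random.default_rng(0)
mix = rng.random((100, 64))               # 100 frames, 64 frequency bins
logits = rng.normal(size=(2, 100, 64))    # hypothetical network output
estimates = separate_two_speakers(mix, logits)
print(estimates.shape)                    # (2, 100, 64)
print(np.allclose(estimates.sum(axis=0), mix))  # True: masks partition the mix
```

Because the masks are a softmax over the speaker axis, the two estimates always sum back to the input mixture; real systems then invert the features back to waveforms and train with a permutation-invariant loss, details the abstract does not specify.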

Authors (5)
  1. S. Rijal (1 paper)
  2. R. Neupane (1 paper)
  3. S. P. Mainali (1 paper)
  4. S. K. Regmi (1 paper)
  5. S. Maharjan (1 paper)
