Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
139 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CNMF (2003.01162v1)

Published 2 Mar 2020 in eess.AS and cs.SD

Abstract: This work addresses the problem of multichannel source separation combining two powerful approaches, multichannel spectral factorization with recent monophonic deep-learning (DL) based spectrum inference. Individual source spectra at different channels are estimated with a Masker-Denoiser Twin Network (MaD TwinNet), able to model long-term temporal patterns of a musical piece. The monophonic source spectrograms are used within a spatial covariance mixing model based on Complex Non-Negative Matrix Factorization (CNMF) that predicts the spatial characteristics of each source. The proposed framework is evaluated on the task of singing voice separation with a large multichannel dataset. Experimental results show that our joint DL+CNMF method outperforms both the individual monophonic DL-based separation and the multichannel CNMF baseline methods.

Summary

We haven't generated a summary for this paper yet.