Class-conditional embeddings for music source separation (1811.03076v1)

Published 7 Nov 2018 in cs.SD, cs.LG, eess.AS, and stat.ML

Abstract: Isolating individual instruments in a musical mixture has a myriad of potential applications, and seems imminently achievable given the levels of performance reached by recent deep learning methods. While most musical source separation techniques learn an independent model for each instrument, we propose using a common embedding space for the time-frequency bins of all instruments in a mixture, inspired by deep clustering and deep attractor networks. Additionally, an auxiliary network is used to generate the parameters of a Gaussian mixture model (GMM), where the posterior distribution over GMM components in the embedding space can be used to create a mask that separates individual sources from a mixture. In addition to outperforming a mask-inference baseline on the MUSDB-18 dataset, our embedding space is easily interpretable and can be used for query-based separation.
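The masking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `gmm_posterior_masks`, the diagonal covariances, and the array shapes are assumptions made for illustration, and in the paper the embeddings and GMM parameters would come from trained networks rather than being passed in by hand. Given one embedding per time-frequency bin and one Gaussian per source, the posterior probability of each component at each bin is used directly as a soft mask.

```python
import numpy as np

def gmm_posterior_masks(embeddings, means, variances, priors):
    """Soft separation masks from GMM posteriors (illustrative sketch).

    embeddings: (T, F, D) one D-dimensional embedding per time-frequency bin
    means:      (K, D)    one Gaussian mean per source (assumed given)
    variances:  (K, D)    diagonal variances per source (assumption)
    priors:     (K,)      mixing weights, summing to 1
    returns:    (K, T, F) masks that sum to 1 over the K sources at each bin
    """
    # Difference between every embedding and every component mean: (K, T, F, D)
    diff = embeddings[None, :, :, :] - means[:, None, None, :]
    var = variances[:, None, None, :]

    # Log-likelihood of each bin under each diagonal Gaussian: (K, T, F)
    log_lik = -0.5 * np.sum(diff ** 2 / var + np.log(2.0 * np.pi * var), axis=-1)

    # Unnormalised log-posterior, then a numerically stable softmax over K
    log_post = np.log(priors)[:, None, None] + log_lik
    log_post -= log_post.max(axis=0, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=0, keepdims=True)

# Usage with random stand-in data (hypothetical sizes)
rng = np.random.default_rng(0)
T, F, D, K = 10, 16, 8, 4
embeddings = rng.normal(size=(T, F, D))
means = rng.normal(size=(K, D))
variances = np.ones((K, D))
priors = np.full(K, 1.0 / K)

masks = gmm_posterior_masks(embeddings, means, variances, priors)
mixture_magnitude = np.abs(rng.normal(size=(T, F)))
source_estimates = masks * mixture_magnitude[None, :, :]  # (K, T, F)
```

Because the masks sum to one at every bin, the source estimates add back up to the mixture magnitude.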

Citations (40)


