Source separation with weakly labelled data: An approach to computational auditory scene analysis (2002.02065v1)

Published 6 Feb 2020 in cs.SD and eess.AS

Abstract: Source separation is the task of separating an audio recording into individual sound sources, and it is fundamental to computational auditory scene analysis. Previous work on source separation has focused on separating particular sound classes such as speech and music, and much of it requires pairs of mixtures and clean sources for training. In this work, we propose a source separation framework trained with weakly labelled data. Weakly labelled data contains only the tags of an audio clip, without the occurrence times of sound events. We first train a sound event detection system on AudioSet. The trained sound event detection system is used to detect segments that are most likely to contain a target sound event. A regression is then learnt from a mixture of two randomly selected segments to a target segment, conditioned on the audio tagging prediction of the target segment. Our proposed system can separate 527 kinds of sound classes from AudioSet within a single system. A U-Net is adopted for the separation system and achieves an average SDR of 5.67 dB over the 527 sound classes in AudioSet.
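The training recipe the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the uniform-weight window scoring, and the random placeholder data are assumptions; in the paper the per-frame probabilities come from a trained sound event detection system and the condition vector is that system's tagging prediction over AudioSet's 527 classes.

```python
import numpy as np

def mine_anchor_segment(frame_probs, seg_len):
    """Return the start frame of the window the SED system scores highest
    for the target class -- a stand-in for the paper's anchor mining."""
    # Sliding-window sum of per-frame event probabilities.
    scores = np.convolve(frame_probs, np.ones(seg_len), mode="valid")
    return int(np.argmax(scores))

def make_training_triple(seg_a, seg_b, tags_a):
    """Build one (mixture, condition, target) training triple: the separator
    learns a regression from the mixture back to the target segment,
    conditioned on the tagging prediction of that segment."""
    mixture = seg_a + seg_b        # mix two randomly selected segments
    return mixture, tags_a, seg_a  # condition on the target's soft tags

rng = np.random.default_rng(0)
frame_probs = rng.random(100)            # placeholder per-frame SED scores
start = mine_anchor_segment(frame_probs, seg_len=10)

seg_a = rng.standard_normal(16000)       # illustrative 1 s segment at 16 kHz
seg_b = rng.standard_normal(16000)
tags_a = rng.random(527)                 # placeholder tags over 527 classes
mixture, condition, target = make_training_triple(seg_a, seg_b, tags_a)
```

A separation network such as the paper's U-Net would then take `mixture` and `condition` as inputs and be trained to output `target`.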

Authors (6)
  1. Qiuqiang Kong (86 papers)
  2. Yuxuan Wang (239 papers)
  3. Xuchen Song (20 papers)
  4. Yin Cao (24 papers)
  5. Wenwu Wang (148 papers)
  6. Mark D. Plumbley (114 papers)
Citations (46)
