
Training sound event detection with soft labels from crowdsourced annotations (2302.14572v1)

Published 28 Feb 2023 in eess.AS and cs.SD

Abstract: In this paper, we study the use of soft labels to train a system for sound event detection (SED). Soft labels can result from annotations which account for human uncertainty about categories, or emerge as a natural representation of multiple opinions in annotation. Converting annotations to hard labels results in unambiguous categories for training, at the cost of losing the details of the label distribution. This work investigates how soft labels can be used, and what benefits they bring in training a SED system. The results show that the system is capable of learning information about the activity of the sounds which is reflected in the soft labels and is able to detect sounds that are missed in the typical binary target training setup. We also release a new dataset produced through crowdsourcing, containing temporally strong labels for sound events in real-life recordings, with both soft and hard labels.
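
The distinction between soft and hard labels drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the vote matrix, the plain averaging of annotator opinions, the 0.5 threshold, and the function names `soft_labels`, `harden`, and `bce` are all assumptions made for illustration, and the loss and aggregation scheme used in the paper may differ.

```python
import math

def soft_labels(votes):
    """Average binary votes per frame (assumed aggregation: unweighted mean).

    votes: one list of 0/1 frame decisions per annotator.
    Returns the fraction of annotators who marked each frame as active.
    """
    n_annotators = len(votes)
    return [sum(frame) / n_annotators for frame in zip(*votes)]

def harden(soft, threshold=0.5):
    """Convert soft labels to hard 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in soft]

def bce(targets, preds, eps=1e-7):
    """Mean binary cross-entropy; targets may be soft values in [0, 1]."""
    total = 0.0
    for t, p in zip(targets, preds):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)

# Hypothetical votes: 5 annotators, 4 time frames, one sound class.
votes = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]

soft = soft_labels(votes)  # [1.0, 0.6, 0.2, 0.0]
hard = harden(soft)        # [1, 1, 0, 0]
```

In this toy example the third frame, which one annotator in five marked as active, keeps a target of 0.2 under soft labels but becomes 0 after hardening, so the minority opinion no longer contributes to training. This is the kind of label-distribution detail the abstract describes as lost.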

Authors (4)
  1. Irene Martín-Morató (10 papers)
  2. Manu Harju (5 papers)
  3. Paul Ahokas (1 paper)
  4. Annamaria Mesaros (29 papers)
Citations (15)