
Onset and offset weighted loss function for sound event detection (2403.13254v1)

Published 20 Mar 2024 in cs.SD and eess.AS

Abstract: In a typical sound event detection (SED) system, the existence of a sound event is detected at a frame level, and consecutive frames with the same event detected are combined into one sound event. A median filter is applied as a post-processing step to remove as many detection errors as possible. However, detection errors occurring around the onset and offset of a sound event are beyond the capacity of the median filter. To address this issue, an onset and offset weighted binary cross-entropy (OWBCE) loss function is proposed in this paper, which trains the DNN model to be more robust on frames around onsets and offsets. Experiments are carried out in the context of DCASE 2022 task 4. Results show that OWBCE outperforms BCE when different models are considered. For a basic CRNN, relative improvements of 6.43% in event-F1, 1.96% in PSDS1, and 2.43% in PSDS2 can be achieved by OWBCE.

