Multitask frame-level learning for few-shot sound event detection (2403.11091v1)
Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events from limited samples. Prevailing few-shot SED methods rely predominantly on segment-level predictions, which often fail to provide detailed, fine-grained predictions, particularly for events of brief duration. Frame-level prediction strategies have been proposed to overcome these limitations, but they commonly suffer from prediction truncation caused by background noise. To alleviate this issue, we introduce an innovative multitask frame-level SED framework. In addition, we introduce TimeFilterAug, a linear timing mask for data augmentation, to increase the model's robustness and adaptability to diverse acoustic environments. The proposed method achieves an F-score of 63.8%, securing 1st rank in the few-shot bioacoustic event detection category of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2023.
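The contrast the abstract draws between segment-level and frame-level prediction can be illustrated with a toy example. The sketch below is not the paper's method: the function names `frames_to_events` and `segment_scores`, the 0.5 threshold, and the 20-frame segment length are all hypothetical choices made for illustration. It only shows why a brief event that is obvious at frame resolution can vanish once scores are averaged over a longer segment.

```python
import numpy as np


def frames_to_events(probs, threshold=0.5):
    """Return (onset_frame, offset_frame) pairs, offset exclusive.

    Hypothetical helper: thresholds per-frame probabilities and
    groups consecutive active frames into events.
    """
    active = np.asarray(probs) >= threshold
    # Pad with False so every active run has a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.diff(padded.astype(int))
    onsets = np.flatnonzero(edges == 1)
    offsets = np.flatnonzero(edges == -1)
    return [(int(on), int(off)) for on, off in zip(onsets, offsets)]


def segment_scores(probs, seg_len):
    """Average frame probabilities over non-overlapping segments.

    Hypothetical stand-in for a segment-level predictor; trailing
    frames that do not fill a whole segment are dropped.
    """
    probs = np.asarray(probs, dtype=float)
    n = len(probs) // seg_len
    return probs[: n * seg_len].reshape(n, seg_len).mean(axis=1)


# Toy clip: 100 frames containing one 3-frame event (frames 40, 41, 42).
probs = np.zeros(100)
probs[40:43] = 0.9

events = frames_to_events(probs)          # frame-level view
segs = segment_scores(probs, seg_len=20)  # segment-level view

print("frame-level events:", events)
print("segment-level scores:", segs)
```

In this toy clip the frame-level view recovers the event boundaries exactly, while the segment containing the event averages to a score well below the threshold. Real systems use overlapping segments and learned embeddings, so the effect is less stark than here.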