
Multitask frame-level learning for few-shot sound event detection (2403.11091v1)

Published 17 Mar 2024 in cs.SD, cs.CV, and eess.AS

Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often fail to provide detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been proposed to overcome these limitations, they commonly face difficulties with prediction truncation caused by background noise. To alleviate this issue, we introduce an innovative multitask frame-level SED framework. In addition, we introduce TimeFilterAug, a linear timing mask for data augmentation, to increase the model's robustness and adaptability to diverse acoustic environments. The proposed method achieves an F-score of 63.8%, securing the 1st rank in the few-shot bioacoustic event detection category of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2023.
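
The abstract describes TimeFilterAug only as a "linear timing mask" for data augmentation and gives no further details. The Python sketch below is a rough, hypothetical illustration of one way a linearly ramped time mask could be applied to a log-mel spectrogram; the function name, parameters, and ramp shape are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def time_filter_aug(spec, max_mask_frac=0.2, rng=None):
    """Hypothetical sketch of a TimeFilterAug-style augmentation.

    `spec` is a (n_mels, n_frames) log-mel spectrogram. A contiguous span of
    frames is attenuated with a linearly ramped mask rather than zeroed
    outright, so event boundaries degrade gradually instead of abruptly.
    The exact formulation in the paper may differ.
    """
    rng = rng or np.random.default_rng()
    n_frames = spec.shape[1]

    # Sample the width of the masked span as a fraction of the clip length.
    width = int(rng.uniform(0.0, max_mask_frac) * n_frames)
    if width < 2:
        return spec.copy()
    start = rng.integers(0, n_frames - width)

    # Linear ramp: full attenuation at the centre of the span, none at its edges.
    ramp = np.concatenate([
        np.linspace(1.0, 0.0, width // 2, endpoint=False),
        np.linspace(0.0, 1.0, width - width // 2),
    ])

    out = spec.copy()
    out[:, start:start + width] *= ramp  # broadcast across mel bins
    return out
```

In a few-shot setting such as the DCASE bioacoustic task, an augmentation of this kind would typically be applied on the fly to support and query spectrograms during episodic training to improve robustness to varying background noise.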
