Post-Processing Independent Evaluation of Sound Event Detection Systems (2306.15440v1)
Abstract: Because the application requirements of sound event detection (SED) systems vary widely, evaluating a system in only a single operating mode is not sufficient. The community has therefore recently adopted the polyphonic sound detection score (PSDS) as an evaluation metric, which is the normalized area under the PSD receiver operating characteristic (PSD-ROC). It summarizes system performance over a range of operating modes obtained by varying the decision threshold that translates the system output scores into binary detections. Hence, it provides a more complete picture of the overall system behavior and is less biased by specific threshold tuning. However, besides the decision threshold, the post-processing can also be changed to enter another operating mode. In this paper we propose the post-processing independent PSDS (piPSDS) as a generalization of the PSDS. Here, the post-processing independent PSD-ROC includes operating points from varying post-processings with varying decision thresholds. Thus, it summarizes even more operating modes of an SED system and allows for system comparison without the need to implement a post-processing and without a bias due to different post-processings. While piPSDS can in principle combine different types of post-processing, we here, as a first step, present median filter independent PSDS (miPSDS) results for this year's DCASE Challenge Task 4a systems. Source code is publicly available in our sed_scores_eval package (https://github.com/fgnt/sed_scores_eval).
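The idea of pooling operating points across post-processings can be illustrated with a small, simplified sketch. It does not use the sed_scores_eval API; the function names, the toy operating points, and the single-class staircase ROC are all assumptions made for illustration. The actual PSDS additionally involves an effective false positive rate and class-wise aggregation, which are omitted here.

```python
# Toy sketch (hypothetical names and numbers, not the sed_scores_eval API):
# pool operating points from several post-processings, keep the best true
# positive rate reachable at each false positive rate, and report the
# normalized area under the resulting staircase curve.

def staircase_roc(points):
    """Keep only operating points that improve TPR as FPR increases."""
    roc = []
    best_tpr = 0.0
    for fpr, tpr in sorted(points):
        if tpr > best_tpr:
            best_tpr = tpr
            roc.append((fpr, tpr))
    return roc

def normalized_area(roc, max_fpr):
    """Area under the staircase on [0, max_fpr], divided by max_fpr."""
    area = 0.0
    for i, (fpr, tpr) in enumerate(roc):
        if fpr >= max_fpr:
            break
        next_fpr = roc[i + 1][0] if i + 1 < len(roc) else max_fpr
        area += tpr * (min(next_fpr, max_fpr) - fpr)
    return area / max_fpr

# Assumed (FPR per hour, TPR) points, each list obtained by sweeping the
# decision threshold under one median filter length.
points_short_filter = [(10, 0.40), (50, 0.60), (100, 0.70)]
points_long_filter = [(5, 0.30), (40, 0.65), (100, 0.68)]
MAX_FPR = 100  # assumed upper integration limit

score_short = normalized_area(staircase_roc(points_short_filter), MAX_FPR)
score_long = normalized_area(staircase_roc(points_long_filter), MAX_FPR)

# Post-processing independent variant: pool all points before building the ROC.
pooled = points_short_filter + points_long_filter
score_pooled = normalized_area(staircase_roc(pooled), MAX_FPR)

print(score_short, score_long, score_pooled)
```

In this sketch the pooled score can never fall below the score of any single post-processing, because the pooled curve is built from a superset of the operating points.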