Enhancing Stereo Sound Event Detection with BiMamba and Pretrained PSELDnet (2507.09570v1)

Published 13 Jul 2025 in eess.AS and cs.SD

Abstract: Pre-training methods have greatly improved the performance of sound event localization and detection (SELD). However, existing Transformer-based models still incur a high computational cost. To address this problem, we present a stereo SELD system that combines a pre-trained PSELDnet with a bidirectional Mamba (BiMamba) sequence model. Specifically, we replace the Conformer module with a BiMamba module. We also use asymmetric convolutions to better capture the time and frequency relationships in the audio signal. Experimental results on the DCASE2025 Task 3 development dataset show that our method performs better than both the baseline and the original PSELDnet with a Conformer decoder. In addition, the proposed model requires fewer computing resources than the baselines. These results show that the BiMamba architecture is effective for solving key challenges in SELD tasks. The source code is publicly accessible at https://github.com/alexandergwm/DCASE2025_TASK3_Stereo_PSELD_Mamba.
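To make the "bidirectional" idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear recurrence with fixed scalar parameters in place of Mamba's learned, input-dependent state-space scan, and it assumes the forward and backward outputs are merged by summation. The names `linear_scan`, `bidirectional_scan`, `decay`, and `gain` are hypothetical and chosen only for illustration.

```python
from typing import List


def linear_scan(xs: List[float], decay: float, gain: float) -> List[float]:
    """Toy recurrence h_t = decay * h_{t-1} + gain * x_t (a stand-in for a state-space scan)."""
    h = 0.0
    out = []
    for x in xs:
        h = decay * h + gain * x
        out.append(h)
    return out


def bidirectional_scan(xs: List[float], decay: float = 0.5, gain: float = 1.0) -> List[float]:
    """Run the scan left-to-right and right-to-left, then sum the two outputs.

    Summation is an assumption made for this sketch; other merge rules
    (for example concatenation) are equally plausible.
    """
    forward = linear_scan(xs, decay, gain)
    # Scan the reversed sequence, then flip the result back to the original order.
    backward = linear_scan(xs[::-1], decay, gain)[::-1]
    return [f + b for f, b in zip(forward, backward)]


if __name__ == "__main__":
    # An impulse in the middle of the sequence influences both earlier and later frames.
    print(bidirectional_scan([0.0, 1.0, 0.0]))
```

Each direction is a single pass over the sequence, so the work in this sketch grows linearly with the number of frames, whereas self-attention compares every pair of frames.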

Authors (2)
