
Assess MBR decoding on real-world noisy speech from diverse communities and regions

Evaluate Minimum Bayes Risk decoding for automatic speech recognition using real-world noisy speech datasets drawn from diverse communities and regions to determine robustness beyond MUSAN-based synthetic noise.
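For orientation, the selection step of sample-based MBR decoding can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the candidate transcripts have already been sampled from an ASR model, and it assumes a length-normalized word-level edit distance as the risk. The names `word_edit_distance` and `mbr_select` are hypothetical, and the paper's utility function and sampling setup may differ.

```python
from typing import List, Sequence


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance computed over word tokens."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mbr_select(candidates: List[str]) -> str:
    """Return the candidate with the lowest average risk against the others.

    Each other candidate serves as a pseudo-reference (an assumption of this
    sketch); risk is edit distance divided by the pseudo-reference length.
    """
    tokenized = [c.split() for c in candidates]

    def expected_risk(i: int) -> float:
        total = 0.0
        for j, pseudo_ref in enumerate(tokenized):
            if i == j:
                continue
            distance = word_edit_distance(pseudo_ref, tokenized[i])
            total += distance / max(len(pseudo_ref), 1)
        return total / max(len(tokenized) - 1, 1)

    best_index = min(range(len(candidates)), key=expected_risk)
    return candidates[best_index]


if __name__ == "__main__":
    samples = ["the cat sat", "the cat sat", "a cat sat", "the bat sat down"]
    print(mbr_select(samples))
```

Running the same selection on transcripts sampled from clean, MUSAN-augmented, and real-world noisy audio, then scoring each output against ground truth, is one way the comparison described above could be set up.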


Background

The paper induces noise using the MUSAN corpus to study robustness, but MUSAN may not capture the full variability and characteristics of real-world noise in different locales and environments.

The authors explicitly note that evaluation on real-world noisy datasets for particular communities and regions is future work, leaving open how MBR performs under authentic noise conditions.

References

Evaluation using real-world noisy datasets for the particular communities and regions is left for future work.

Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition (2510.19471 - Jinnai, 22 Oct 2025) in Section 6 (Limitations)