Develop a fast implementation of Minimum Bayes Risk decoding

Develop a computationally efficient implementation of sample-based Minimum Bayes Risk decoding that substantially reduces walltime while preserving accuracy in speech-to-text applications.

Background

The paper reports that MBR decoding improves accuracy over beam search but incurs significant computational cost, with walltime dominated by utility computation and hypothesis sampling. Although faster MBR algorithms exist in the literature and Whisper has optimized inference libraries (e.g., faster-whisper, whisper.cpp), integrating these optimizations for MBR requires additional engineering.
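
To make the cost structure concrete, the following is a minimal sketch of sample-based MBR selection, not the paper's implementation. It assumes the hypotheses have already been sampled from the model, and it uses negative word error rate as an illustrative utility; the function names (`word_edit_distance`, `neg_wer_utility`, `mbr_decode`) are hypothetical. With N samples, the selection loop makes N × N utility calls, which is one reason utility computation can dominate walltime alongside sampling.

```python
from typing import Callable, Sequence


def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between the whitespace-split word sequences of a and b."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, start=1):
        curr = [i]
        for j, yj in enumerate(y, start=1):
            cost = 0 if xi == yj else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def neg_wer_utility(hyp: str, ref: str) -> float:
    """Illustrative utility (an assumption): negative WER of hyp against ref."""
    n_ref = max(len(ref.split()), 1)  # guard against empty pseudo-references
    return -word_edit_distance(hyp, ref) / n_ref


def mbr_decode(
    samples: Sequence[str],
    utility: Callable[[str, str], float] = neg_wer_utility,
) -> str:
    """Return the sample with the highest average utility against all samples.

    Each sample serves both as a candidate and as a pseudo-reference,
    so this loop performs len(samples) ** 2 utility evaluations.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    best, best_score = samples[0], float("-inf")
    for hyp in samples:
        score = sum(utility(hyp, ref) for ref in samples) / len(samples)
        if score > best_score:
            best, best_score = hyp, score
    return best


if __name__ == "__main__":
    # Toy transcripts standing in for sampled model outputs.
    hypotheses = ["the cat sat", "the cat sat", "a cat sat", "the cat sat down"]
    print(mbr_decode(hypotheses))
```

A faster implementation would presumably target both halves of this cost: batching or accelerating hypothesis sampling, and reducing the number or cost of utility evaluations in the inner loop.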

The authors explicitly state that creating a fast implementation is future work, highlighting a practical barrier to MBR adoption in latency-sensitive scenarios.

References

Developing a fast implementation of MBR decoding is left for future work.

Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition (2510.19471 - Jinnai, 22 Oct 2025) in Section 6 (Limitations)