Evaluating WER under selective prediction in long-form ASR
Determine a principled method to compute and report Word Error Rate (WER) for long-form automatic speech recognition (ASR) when a subset of predicted words is intentionally ignored, for example filtered out by word-level uncertainty, so that evaluation under selective prediction is fair and meaningful.
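To make the ambiguity concrete, the sketch below implements one possible convention. It is not the method of the cited paper and is not offered as a resolution of the problem. It assumes that the reference and hypothesis are aligned once with a standard word-level Levenshtein alignment, that errors attached to rejected hypothesis words are dropped together with the reference words they align to, and that deletions are always counted. The names `align`, `selective_wer`, and `keep` are hypothetical.

```python
from typing import List, Tuple


def align(ref: List[str], hyp: List[str]) -> List[Tuple[str, int, int]]:
    """Word-level Levenshtein alignment.

    Returns a list of (kind, ref_idx, hyp_idx) with kind in
    {"ok", "sub", "del", "ins"}; an index is -1 where no word is involved.
    """
    n, m = len(ref), len(hyp)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match / substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion
    # Backtrace from the bottom-right corner to recover the operations.
    ops: List[Tuple[str, int, int]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                ops.append(("ok" if cost == 0 else "sub", i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("del", i - 1, -1))
            i -= 1
        else:
            ops.append(("ins", -1, j - 1))
            j -= 1
    ops.reverse()
    return ops


def selective_wer(ref: List[str], hyp: List[str],
                  keep: List[bool]) -> Tuple[float, float]:
    """Return (WER over retained words, coverage).

    keep[j] is False when hypothesis word j is rejected, e.g. because its
    uncertainty is too high. Assumed convention (not from the source):
    a rejected word's error and its aligned reference word are both
    excluded; deletions have no hypothesis word and are always counted.
    """
    errors = 0
    ref_count = 0
    for kind, _ref_idx, hyp_idx in align(ref, hyp):
        if hyp_idx >= 0 and not keep[hyp_idx]:
            continue  # rejected hypothesis word: skip error and denominator
        if kind != "ins":
            ref_count += 1
        if kind != "ok":
            errors += 1
    coverage = sum(keep) / len(keep) if keep else 1.0
    wer = errors / ref_count if ref_count else 0.0
    return wer, coverage


ref = "the cat sat on the mat".split()
hyp = "the cat sad on mat".split()
print(selective_wer(ref, hyp, [True] * len(hyp)))                 # no rejection
print(selective_wer(ref, hyp, [True, True, False, True, True]))  # reject "sad"
```

In this toy example, rejecting the substituted word lowers the error rate from 2/6 to 1/5 at 80% coverage. Other conventions are equally defensible, such as keeping the full reference length in the denominator or counting a rejected word as a deletion, and they give different numbers. Reporting WER together with coverage at least makes the chosen trade-off visible.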
References
However, in long-form speech recognition, it is not clear how to evaluate WER when ignoring some words.
— Pisets: A Robust Speech Recognition System for Lectures and Interviews
(2601.18415 - Bondarenko et al., 26 Jan 2026) in Uncertainty modeling metrics (Section: Uncertainty modeling)