Validate MetricX for ASR by establishing its correlation with human evaluation
Determine the correlation between MetricX (metricx-23-xxl-v2p0) scores and human evaluations of automatic speech recognition outputs to assess the suitability of MetricX as a quality metric for ASR.
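One way such a study could be scored is sketched below, assuming each ASR output already has a MetricX score and a human rating. The source does not prescribe an analysis protocol, so the choice of Pearson and Spearman coefficients, the function names, and the sample numbers are all illustrative assumptions rather than results. The sketch also assumes MetricX reports an error-like score (lower is better) while the human rating is higher-is-better, in which case a suitable metric would show a strongly negative correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT measurements from any study.
    metricx_scores = [1.2, 4.8, 0.5, 9.1, 3.3]  # assumed: lower is better
    human_ratings = [4.5, 3.0, 5.0, 1.5, 3.5]   # assumed: higher is better
    print("Pearson: ", pearson(metricx_scores, human_ratings))
    print("Spearman:", spearman(metricx_scores, human_ratings))
```

In practice the same coefficients would normally come from an established statistics library; the hand-written versions here only keep the sketch dependency-free.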
References
Its correlation with human evaluation for ASR task is unknown, yet given its splendid accuracy in machine translation, it would be a useful metric for ASR.
— Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition
(2510.19471 - Jinnai, 22 Oct 2025) in Section 4.1 (Automatic Speech Recognition), Evaluation metrics