
A Textless Metric for Speech-to-Speech Comparison (2210.11835v2)

Published 21 Oct 2022 in cs.CL, cs.SD, and eess.AS

Abstract: In this paper, we introduce a new and simple method for comparing speech utterances without relying on text transcripts. Our speech-to-speech comparison metric utilizes state-of-the-art speech2unit encoders like HuBERT to convert speech utterances into discrete acoustic units. We then propose a simple and easily replicable neural architecture that learns a speech-based metric that closely corresponds to its text-based counterpart. This textless metric has numerous potential applications, including evaluating speech-to-speech translation for oral languages or for languages without dependable ASR systems, or avoiding the need for ASR transcription altogether. This paper also shows that for speech-to-speech translation evaluation, ASR-BLEU (which consists of automatically transcribing both the speech hypothesis and the reference and computing sentence-level BLEU between the transcripts) is a poor proxy for real text-BLEU even when the ASR system is strong.
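
To make the ASR-BLEU baseline mentioned in the abstract concrete, here is a minimal sketch of a sentence-level BLEU score between two transcripts. It is an illustration under stated assumptions, not the authors' implementation: the transcripts are assumed to have been produced already by some ASR system and are passed in as plain strings, the function names `ngram_counts` and `sentence_bleu` are hypothetical, tokenization is simple whitespace splitting, and add-one smoothing is applied to every n-gram order, which differs from the smoothing used by standard BLEU toolkits.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU between two transcripts.

    Assumptions (not from the paper): lowercased whitespace tokenization
    and add-one smoothing on every n-gram precision.
    """
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        log_precisions.append(math.log((overlap + 1) / (total + 1)))

    # Brevity penalty for hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))

    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# In ASR-BLEU, both strings would be ASR transcripts of speech;
# here they are typed in by hand for illustration.
hyp_transcript = "the cat sat on the mat"
ref_transcript = "the cat is on the mat"
score = sentence_bleu(hyp_transcript, ref_transcript)
```

Because both inputs pass through ASR before scoring, any recognition error in either transcript changes the n-gram overlap, which is one way such a score can drift from BLEU computed on the true text.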

Authors (4)
  1. Laurent Besacier (76 papers)
  2. Swen Ribeiro (1 paper)
  3. Olivier Galibert (4 papers)
  4. Ioan Calapodescu (12 papers)
Citations (5)
