
Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection (2310.15752v1)

Published 24 Oct 2023 in cs.CL and cs.AI

Abstract: When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits. Rather, they should assign gender according to the speakers' preference. The existing solutions to do so, though effective, are hardly feasible in practice as they involve dedicated model re-training on gender-labeled ST data. To overcome these limitations, we propose the first inference-time solution to control speaker-related gender inflections in ST. Our approach partially replaces the (biased) internal language model (LM) implicitly learned by the ST decoder with gender-specific external LMs. Experiments on en->es/fr/it show that our solution outperforms the base models and the best training-time mitigation strategy by up to 31.0 and 1.6 points in gender accuracy, respectively, for feminine forms. The gains are even larger (up to 32.0 and 3.4) in the challenging condition where speakers' vocal traits conflict with their gender.
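
The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to "partially replace" an internal LM at inference time: subtract a weighted internal-LM log-probability from the ST decoder score and add a weighted log-probability from a gender-specific external LM. The function names, the weights `ilm_weight` and `ext_weight`, and all probability values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import math


def fuse_scores(st_logp, ilm_logp, ext_logp, ilm_weight=0.5, ext_weight=0.5):
    """Combine per-token log-probabilities for a single decoding step.

    score(tok) = log p_ST(tok) - ilm_weight * log p_ILM(tok)
                               + ext_weight * log p_EXT(tok)
    The weights are illustrative assumptions, not values from the paper.
    """
    return {
        tok: st_logp[tok] - ilm_weight * ilm_logp[tok] + ext_weight * ext_logp[tok]
        for tok in st_logp
    }


def pick_token(scores):
    """Return the highest-scoring token (greedy choice for one step)."""
    return max(scores, key=scores.get)


# Toy distributions over the Italian forms of "tired" (invented numbers).
st_logp = {"stanco": math.log(0.7), "stanca": math.log(0.3)}      # ST decoder
ilm_logp = {"stanco": math.log(0.8), "stanca": math.log(0.2)}     # estimated internal LM
fem_lm_logp = {"stanco": math.log(0.1), "stanca": math.log(0.9)}  # feminine external LM

base_choice = pick_token(st_logp)
fused_choice = pick_token(fuse_scores(st_logp, ilm_logp, fem_lm_logp))

print(base_choice, fused_choice)
```

With these toy numbers, the unmodified decoder scores favor the masculine form, while the fused scores favor the feminine one. A real system would apply the same combination at every step of beam search over the full vocabulary.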

Authors (6)
  1. Dennis Fucci (11 papers)
  2. Marco Gaido (47 papers)
  3. Sara Papi (33 papers)
  4. Mauro Cettolo (20 papers)
  5. Matteo Negri (93 papers)
  6. Luisa Bentivogli (38 papers)
Citations (2)
