CTC Alignments Improve Autoregressive Translation (2210.05200v1)

Published 11 Oct 2022 in cs.CL, cs.SD, and eess.AS

Abstract: Connectionist Temporal Classification (CTC) is a widely used approach for automatic speech recognition (ASR) that performs conditionally independent monotonic alignment. However, for translation, CTC exhibits clear limitations due to the contextual and non-monotonic nature of the task, and it thus lags behind attentional decoder approaches in translation quality. In this work, we argue that CTC does in fact make sense for translation if applied in a joint CTC/attention framework, wherein CTC's core properties can counteract several key weaknesses of pure-attention models during training and decoding. To validate this conjecture, we modify the Hybrid CTC/Attention model originally proposed for ASR to support text-to-text translation (MT) and speech-to-text translation (ST). Our proposed joint CTC/attention models outperform pure-attention baselines across six benchmark translation tasks.
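At decoding time, joint CTC/attention models typically rescore beam-search hypotheses by interpolating the two models' log-probabilities. The sketch below illustrates that scoring rule; the function name, tensor shapes, and the 0.3 weight are illustrative assumptions, not values taken from the paper.

```python
import torch

def joint_score(attn_logprob: torch.Tensor,
                ctc_logprob: torch.Tensor,
                ctc_weight: float = 0.3) -> torch.Tensor:
    """Interpolate attention-decoder and CTC log-probabilities.

    Implements the standard joint scoring rule
        log p(y|x) = (1 - w) * log p_attn(y|x) + w * log p_ctc(y|x),
    where w = ctc_weight balances the two models. (Hypothetical
    helper for illustration; weight value is an assumption.)
    """
    return (1.0 - ctc_weight) * attn_logprob + ctc_weight * ctc_logprob

# Toy example: score three beam hypotheses under both models.
attn = torch.log(torch.tensor([0.5, 0.3, 0.2]))  # attention-decoder scores
ctc = torch.log(torch.tensor([0.4, 0.4, 0.2]))   # CTC prefix scores
print(joint_score(attn, ctc))
```

The intuition behind the interpolation is that CTC's monotonic, conditionally independent alignment penalizes hypotheses that drift from the source (a failure mode of pure-attention decoders), while the attention decoder supplies the contextual modeling that CTC alone lacks.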

Authors (7)
  1. Brian Yan (40 papers)
  2. Siddharth Dalmia (36 papers)
  3. Yosuke Higuchi (23 papers)
  4. Graham Neubig (342 papers)
  5. Florian Metze (79 papers)
  6. Alan W Black (83 papers)
  7. Shinji Watanabe (416 papers)
Citations (31)