
Infusing Future Information into Monotonic Attention Through Language Models (2109.03121v1)

Published 7 Sep 2021 in cs.CL

Abstract: Simultaneous neural machine translation (SNMT) models start emitting the target sequence before they have processed the full source sequence. Recent adaptive policies for SNMT use monotonic attention to perform read/write decisions based on the partial source and target sequences. The lack of sufficient information can cause the monotonic attention to make poor read/write decisions, which in turn degrades the performance of the SNMT model. Human translators, by contrast, make better read/write decisions because they can anticipate the immediate future words using linguistic information and domain knowledge. Motivated by human translators, in this work we propose a framework that aids monotonic attention with an external language model to improve its decisions. We conduct experiments on the MuST-C English-German and English-French speech-to-text translation tasks to show the effectiveness of the proposed framework. The proposed SNMT method improves the quality-latency trade-off over the state-of-the-art monotonic multihead attention.
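To make the idea concrete, here is a minimal, purely illustrative sketch of a simultaneous decoding loop in which the read/write policy sees the observed source prefix augmented with a few words anticipated by an external language model. All function names and the toy heuristic policy are hypothetical stand-ins, not the authors' implementation; the paper integrates the language model into the monotonic attention mechanism itself rather than into a hand-written loop like this.

```python
# Hypothetical sketch: LM-aided read/write decisions for simultaneous translation.
# The stub functions stand in for the external language model, the learned
# monotonic-attention policy, and the SNMT decoder described in the paper.

from typing import List


def lm_predict_future(prefix: List[str], k: int = 2) -> List[str]:
    """Stand-in for the external language model: anticipates the next k
    source words given the partially observed source prefix."""
    return [f"<anticipated_{i}>" for i in range(k)]


def monotonic_policy(source_ctx: List[str], target_prefix: List[str]) -> str:
    """Stand-in for the monotonic-attention read/write decision.
    Toy heuristic: emit roughly one target word per two source words."""
    return "WRITE" if len(source_ctx) >= 2 * (len(target_prefix) + 1) else "READ"


def translate_word(source_ctx: List[str], target_prefix: List[str]) -> str:
    """Stand-in for the SNMT decoder emitting a single target word."""
    return f"tgt_{len(target_prefix)}"


def simultaneous_translate(source_stream: List[str], max_target_len: int = 50) -> List[str]:
    """Greedy simultaneous loop: before each read/write decision, the observed
    source prefix is augmented with LM-anticipated future words, mimicking the
    way a human translator anticipates the rest of the sentence."""
    observed: List[str] = []
    target: List[str] = []
    i = 0
    while len(target) < max_target_len:
        source_exhausted = i >= len(source_stream)
        # Only anticipate while the source is still arriving and non-empty.
        future = lm_predict_future(observed) if observed and not source_exhausted else []
        action = monotonic_policy(observed + future, target)
        if action == "READ" and not source_exhausted:
            observed.append(source_stream[i])
            i += 1
        else:
            # WRITE (or a forced write once the whole source has been read).
            target.append(translate_word(observed, target))
            if source_exhausted and len(target) >= len(observed):
                break
    return target


if __name__ == "__main__":
    print(simultaneous_translate("wir sehen uns morgen früh".split()))
```

In this toy loop, the anticipated words let the policy choose WRITE earlier than it could from the observed prefix alone, which is the intuition behind the reported quality-latency improvement.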

Authors (5)
  1. Mohd Abbas Zaidi (6 papers)
  2. Sathish Indurthi (4 papers)
  3. Beomseok Lee (7 papers)
  4. Nikhil Kumar Lakumarapu (3 papers)
  5. Sangha Kim (8 papers)
Citations (2)
