Towards Multimodal Simultaneous Neural Machine Translation (2004.03180v2)

Published 7 Apr 2020 in cs.CL

Abstract: Simultaneous translation involves translating a sentence before the speaker's utterance is complete, enabling real-time understanding across multiple languages. This task is significantly more challenging than general full-sentence translation because of the shortage of input information during decoding. To alleviate this shortage, we propose multimodal simultaneous neural machine translation (MSNMT), which leverages visual information as an additional modality. Our experiments on the Multi30k dataset showed that MSNMT significantly outperforms its text-only counterpart in low-latency settings, where timely translation matters most. Furthermore, we verified the importance of visual information during decoding through an adversarial evaluation of MSNMT, in which we studied how models behave when given an incongruent input modality and analyzed the effect of word-order differences between source and target languages.
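
The abstract does not detail the architecture, but the core idea, conditioning an incremental decoder on both the available source prefix and an image feature, can be sketched briefly. The code below is a toy illustration, not the authors' model: the wait-k read/write policy, the GRU encoder and decoder, the mean-pooled prefix context, the use of a global image feature to initialize the decoder state, and the class name ToyMSNMT are all assumptions for the sketch.

# Hedged sketch of multimodal simultaneous decoding. Everything here is
# illustrative: the paper's abstract specifies neither the decoding policy
# nor the fusion mechanism, so wait-k and state initialization from a
# projected image feature are assumed choices, not the authors' method.
import torch
import torch.nn as nn

class ToyMSNMT(nn.Module):
    def __init__(self, vocab_src, vocab_tgt, d_model=256, d_img=2048):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_src, d_model)
        self.tgt_emb = nn.Embedding(vocab_tgt, d_model)
        # Unidirectional GRU: state at position i depends only on tokens <= i,
        # so slicing its outputs to a prefix is genuinely causal.
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.img_proj = nn.Linear(d_img, d_model)  # project a pooled CNN feature
        self.decoder = nn.GRUCell(d_model * 2, d_model)
        self.out = nn.Linear(d_model, vocab_tgt)

    def forward(self, src, img_feat, max_len=20, k=3, bos=1):
        enc_out, _ = self.encoder(self.src_emb(src))      # (B, S, d)
        h = self.img_proj(img_feat)                       # image initializes state
        y = torch.full((src.size(0),), bos, dtype=torch.long)
        logits = []
        for t in range(max_len):
            # wait-k policy: when emitting target token t, only the first
            # k + t source tokens have been read.
            visible = min(k + t, src.size(1))
            src_ctx = enc_out[:, :visible].mean(dim=1)    # prefix summary
            h = self.decoder(torch.cat([self.tgt_emb(y), src_ctx], dim=-1), h)
            step = self.out(h)                            # (B, vocab_tgt)
            logits.append(step)
            y = step.argmax(dim=-1)                       # greedy decode
        return torch.stack(logits, dim=1)                 # (B, max_len, vocab_tgt)

# Usage with random tensors standing in for real data:
model = ToyMSNMT(vocab_src=1000, vocab_tgt=1000)
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences
img = torch.randn(2, 2048)             # e.g. a pooled image feature vector
out = model(src, img)                  # (2, 20, 1000) logits

The adversarial evaluation described in the abstract maps naturally onto this sketch: feed img_feat from an unrelated image (an incongruent modality) and measure how much translation quality degrades relative to the congruent case.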

Authors (4)
  1. Aizhan Imankulova (6 papers)
  2. Masahiro Kaneko (46 papers)
  3. Tosho Hirasawa (8 papers)
  4. Mamoru Komachi (40 papers)
Citations (15)
