Multimodal Attention for Neural Machine Translation (1609.03976v1)

Published 13 Sep 2016 in cs.CL and cs.NE

Abstract: The attention mechanism is an important part of neural machine translation (NMT), where it has been reported to produce richer source representations than fixed-length encoding in sequence-to-sequence models. Recently, the effectiveness of attention has also been explored in the context of image captioning. In this work, we assess the feasibility of a multimodal attention mechanism that simultaneously attends over an image and its natural language description in order to generate a description in another language. We train several variants of our proposed attention mechanism on the Multi30k multilingual image captioning dataset. We show that using a dedicated attention for each modality yields improvements of up to 1.6 BLEU and METEOR points over a text-only NMT baseline.
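The central idea stated in the abstract is a dedicated attention mechanism per modality, whose contexts are then combined for the decoder. The following is a minimal illustrative sketch in PyTorch, not the authors' implementation; the module names, layer sizes, and the simple concatenation-based fusion of the two contexts are assumptions made for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityAttention(nn.Module):
    """Additive (Bahdanau-style) attention over one modality's annotations."""
    def __init__(self, ann_dim, dec_dim, att_dim):
        super().__init__()
        self.ann_proj = nn.Linear(ann_dim, att_dim, bias=False)
        self.dec_proj = nn.Linear(dec_dim, att_dim, bias=False)
        self.score = nn.Linear(att_dim, 1, bias=False)

    def forward(self, annotations, dec_state):
        # annotations: (batch, n_positions, ann_dim); dec_state: (batch, dec_dim)
        energy = torch.tanh(self.ann_proj(annotations)
                            + self.dec_proj(dec_state).unsqueeze(1))
        alpha = F.softmax(self.score(energy).squeeze(-1), dim=-1)        # (batch, n_positions)
        context = torch.bmm(alpha.unsqueeze(1), annotations).squeeze(1)  # (batch, ann_dim)
        return context, alpha

class MultimodalAttention(nn.Module):
    """One attention per modality; the two contexts are fused into a single vector."""
    def __init__(self, txt_dim, img_dim, dec_dim, att_dim):
        super().__init__()
        self.txt_att = ModalityAttention(txt_dim, dec_dim, att_dim)
        self.img_att = ModalityAttention(img_dim, dec_dim, att_dim)
        self.fuse = nn.Linear(txt_dim + img_dim, dec_dim)

    def forward(self, txt_ann, img_ann, dec_state):
        c_txt, a_txt = self.txt_att(txt_ann, dec_state)
        c_img, a_img = self.img_att(img_ann, dec_state)
        # Concatenate and project the modality-specific contexts (an assumed fusion choice).
        return self.fuse(torch.cat([c_txt, c_img], dim=-1)), (a_txt, a_img)

# Toy usage: 12 source-word annotations and 196 image-region features (e.g. a 14x14 conv map).
txt = torch.randn(2, 12, 512)
img = torch.randn(2, 196, 1024)
state = torch.randn(2, 256)
ctx, (a_txt, a_img) = MultimodalAttention(512, 1024, 256, 128)(txt, img, state)
print(ctx.shape, a_txt.shape, a_img.shape)  # (2, 256), (2, 12), (2, 196)
```

At each decoding step the decoder state queries both modalities independently, so the model can learn when to rely on the source sentence and when to rely on image regions.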

Authors (3)
  1. Ozan Caglayan (20 papers)
  2. Loïc Barrault (34 papers)
  3. Fethi Bougares (18 papers)
Citations (74)
