Neural Machine Translation with Latent Semantic of Image and Text (1611.08459v1)

Published 25 Nov 2016 in cs.CL

Abstract: Although attention-based Neural Machine Translation (NMT) has achieved great success, the attention mechanism cannot capture the entire meaning of the source sentence because it generates each target word by depending heavily on the relevant parts of the source sentence. Earlier studies introduced a latent variable to capture the entire meaning of the sentence and achieved improvements over attention-based NMT. We follow this approach, and we believe that capturing the meaning of a sentence benefits from image information, because human beings understand the meaning of language not only from textual information but also from perceptual information such as that gained from vision. We propose a neural machine translation model that introduces a continuous latent variable containing an underlying semantic representation extracted from texts and images. Our model, which can be trained end-to-end, requires image information only during training. Experiments conducted on an English--German translation task show that our model outperforms the baseline.
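The abstract does not spell out the training objective, but the setup reads like a conditional variational model: a prior over the latent variable computed from the source text alone, and an approximate posterior that additionally sees image features during training, so that no image is needed at test time. Below is a minimal, hypothetical sketch of that pattern; the module names, dimensionalities, and GRU/linear choices are illustrative assumptions, not the authors' architecture (which is attention-based).

```python
# Hypothetical sketch of a latent-variable NMT step (not the authors' code).
# Prior p(z | source text) is available at train and test time; the
# posterior q(z | source text, image) is used only during training, so
# image features are required only when training.
import torch
import torch.nn as nn

class LatentSemanticNMT(nn.Module):
    def __init__(self, vocab_size=10000, emb=256, hid=512, z_dim=128, img_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)  # shared src/tgt embedding for brevity
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.prior = nn.Linear(hid, 2 * z_dim)               # text-only prior
        self.posterior = nn.Linear(hid + img_dim, 2 * z_dim) # text + image posterior
        self.decoder = nn.GRU(emb + z_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def reparameterize(self, stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return z, mu, logvar

    def forward(self, src, tgt, img_feat=None):
        _, h = self.encoder(self.embed(src))  # h: (1, batch, hid)
        h = h.squeeze(0)
        p_stats = self.prior(h)
        if img_feat is not None:              # training: fuse image features
            q_stats = self.posterior(torch.cat([h, img_feat], dim=-1))
        else:                                 # test time: fall back to the prior
            q_stats = p_stats
        z, mu_q, logvar_q = self.reparameterize(q_stats)
        mu_p, logvar_p = p_stats.chunk(2, dim=-1)
        # KL(q || p) between two diagonal Gaussians, per batch element
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                    - 1).sum(-1).mean()
        # condition every decoder step on the sampled latent z
        z_seq = z.unsqueeze(1).expand(-1, tgt.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([self.embed(tgt), z_seq], dim=-1))
        return self.out(dec_out), kl  # train on cross-entropy(logits) + kl
```

At inference the image branch is simply skipped and the text-only prior supplies z, which is one plausible way to realize the paper's claim that image information is required only during training.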

Authors (5)
  1. Joji Toyama (1 paper)
  2. Masanori Misono (2 papers)
  3. Masahiro Suzuki (55 papers)
  4. Kotaro Nakayama (7 papers)
  5. Yutaka Matsuo (128 papers)
Citations (14)
