
Image search using multilingual texts: a cross-modal learning approach between image and text (1903.11299v3)

Published 27 Mar 2019 in cs.CV and cs.CL

Abstract: Multilingual (or cross-lingual) embeddings represent several languages in a single vector space. Using a common embedding space enables shared semantics between words from different languages. In this paper, we propose to embed images and texts into a single distributional vector space, making it possible to search for images using text queries that express information needs related to the (visual) content of the images, as well as through image similarity. Our framework forces the representation of an image to be similar to the representation of the text that describes it. Moreover, by using multilingual embeddings we ensure that words from two different languages have close descriptors and are thus attached to similar images. We provide experimental evidence of the effectiveness of our approach by evaluating it on two datasets: Common Objects in COntext (COCO) [19] and Multi30K [7].
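
The core idea described in the abstract, aligning image representations with the multilingual embeddings of their captions in one shared space, can be sketched as follows. This is a minimal illustration, not the authors' exact architecture: it assumes pre-extracted CNN image features and pre-computed multilingual sentence embeddings, and uses a standard triplet ranking loss with in-batch hard negatives; all dimensions and names (`CrossModalEmbedder`, `triplet_ranking_loss`) are hypothetical.

```python
# Sketch: project image features and multilingual text embeddings into a
# shared space, then pull matching image/caption pairs together with a
# triplet ranking loss (illustrative, not the paper's exact formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalEmbedder(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=300, shared_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, shared_dim)  # projects CNN image features
        self.txt_proj = nn.Linear(txt_dim, shared_dim)  # projects multilingual sentence embeddings

    def forward(self, img_feats, txt_feats):
        # L2-normalise so cosine similarity reduces to a dot product
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return img, txt

def triplet_ranking_loss(img, txt, margin=0.2):
    # Similarity matrix between every image and every caption in the batch;
    # the diagonal holds the matching (positive) pairs.
    sim = img @ txt.t()
    pos = sim.diag().unsqueeze(1)
    # Hinge cost against all negatives, in both retrieval directions.
    cost_txt = (margin + sim - pos).clamp(min=0)      # image -> caption
    cost_img = (margin + sim - pos.t()).clamp(min=0)  # caption -> image
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    cost_txt = cost_txt.masked_fill(mask, 0)
    cost_img = cost_img.masked_fill(mask, 0)
    # Keep only the hardest in-batch negative for each anchor.
    return cost_txt.max(dim=1)[0].mean() + cost_img.max(dim=0)[0].mean()

# Usage with a batch of matching image/caption pairs (random stand-in features).
model = CrossModalEmbedder()
img_feats = torch.randn(32, 2048)   # e.g. CNN pooled features
txt_feats = torch.randn(32, 300)    # e.g. multilingual sentence embeddings
img_emb, txt_emb = model(img_feats, txt_feats)
loss = triplet_ranking_loss(img_emb, txt_emb)
loss.backward()
```

At retrieval time, a text query in any supported language would be projected with the text branch and images ranked by cosine similarity to it; image-to-image search uses the same space with an image embedding as the query.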

Authors (6)
  1. Maxime Portaz (2 papers)
  2. Hicham Randrianarivo (4 papers)
  3. Adrien Nivaggioli (5 papers)
  4. Estelle Maudet (2 papers)
  5. Christophe Servan (16 papers)
  6. Sylvain Peyronnet (4 papers)
Citations (12)