Contrastive Language-Image Pre-training for the Italian Language (2108.08688v1)

Published 19 Aug 2021 in cs.CL and cs.CV

Abstract: CLIP (Contrastive Language-Image Pre-training) is a very recent multi-modal model that jointly learns representations of images and texts. The model is trained on a massive amount of English data and shows impressive performance on zero-shot classification tasks. Training the same model on a different language is not trivial, since data in other languages might not be sufficient and the model needs high-quality translations of the texts to guarantee good performance. In this paper, we present the first CLIP model for the Italian language (CLIP-Italian), trained on more than 1.4 million image-text pairs. Results show that CLIP-Italian outperforms the multilingual CLIP model on the tasks of image retrieval and zero-shot classification.
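
The contrastive objective the abstract refers to trains the image and text encoders so that, within a batch of matching image-text pairs, each image scores highest against its own caption and vice versa. Below is a minimal, hypothetical PyTorch sketch of that symmetric contrastive (InfoNCE-style) loss; it is not code from the paper, and the embeddings, function name, and `temperature` default are illustrative stand-ins for actual encoder outputs and hyperparameters.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matching image-text pairs.

    image_emb, text_emb: (batch, dim) encoder outputs, where row i of each
    tensor is assumed to describe the same example (the positive pair).
    """
    # L2-normalize so dot products become cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix; the diagonal holds the positives.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0))

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random tensors standing in for encoder outputs.
imgs = torch.randn(8, 512)
txts = torch.randn(8, 512)
print(clip_contrastive_loss(imgs, txts))
```

Zero-shot classification then reuses the same similarity: class labels are written as captions, embedded with the text encoder, and an image is assigned the label whose caption embedding it matches most closely.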

Authors (6)
  1. Federico Bianchi (47 papers)
  2. Giuseppe Attanasio (21 papers)
  3. Raphael Pisoni (3 papers)
  4. Silvia Terragni (8 papers)
  5. Gabriele Sarti (21 papers)
  6. Sri Lakshmi (1 paper)
Citations (29)