Text-to-Image Synthesis Based on Machine Generated Captions (1910.04056v1)

Published 9 Oct 2019 in cs.LG, cs.CL, and stat.ML

Abstract: Text-to-image synthesis is the automatic generation of a photo-realistic image from a given text, and it is revolutionizing many real-world applications. Performing this task requires datasets of captioned images, in which each image is associated with one or more captions describing it. Although uncaptioned image datasets are abundant, captioned datasets are limited in number. To address this issue, we propose an approach that generates images from a given text using conditional GANs trained on an uncaptioned image dataset. In particular, uncaptioned images are fed to an Image Captioning Module to generate descriptions. The GAN Module is then trained on both the input image and the machine-generated caption. To evaluate the results, we compare the performance of our solution with that of an unconditional GAN. For the experiments, we use the uncaptioned LSUN bedroom dataset. The results of our study are preliminary but promising.
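
The two-stage data flow described in the abstract (caption the uncaptioned images first, then train a conditional GAN on the resulting image and caption pairs) can be pictured with a minimal sketch. This is not the authors' implementation: `CaptionedExample`, `build_captioned_dataset`, `toy_captioner`, and `training_pairs` are hypothetical names introduced only for illustration, and the captioner is a stand-in that never looks at pixels.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class CaptionedExample:
    image_id: str  # hypothetical identifier of an uncaptioned image
    caption: str   # machine-generated description of that image


def build_captioned_dataset(
    image_ids: Iterable[str],
    captioner: Callable[[str], str],
) -> List[CaptionedExample]:
    """Stage 1: pair every uncaptioned image with a machine-generated caption."""
    return [CaptionedExample(i, captioner(i)) for i in image_ids]


def toy_captioner(image_id: str) -> str:
    # Assumption: stand-in for an image captioning model.
    return f"a bedroom photo ({image_id})"


def training_pairs(
    dataset: Iterable[CaptionedExample],
) -> Iterator[Tuple[str, str]]:
    """Stage 2 input: (image, caption) pairs a conditional GAN would train on."""
    for example in dataset:
        yield example.image_id, example.caption


dataset = build_captioned_dataset(["img_0", "img_1"], toy_captioner)
```

In a real pipeline the captioner would be a trained captioning network and each pair would feed a conditional GAN training step; the sketch only shows how the machine-generated captions take the place of human annotations.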

Authors (7)
  1. Marco Menardi (1 paper)
  2. Alex Falcon (10 papers)
  3. Saida S. Mohamed (1 paper)
  4. Lorenzo Seidenari (21 papers)
  5. Giuseppe Serra (39 papers)
  6. Alberto Del Bimbo (85 papers)
  7. Carlo Tasso (3 papers)