Dual Adversarial Inference for Text-to-Image Synthesis (1908.05324v1)

Published 14 Aug 2019 in cs.CV

Abstract: Synthesizing images from a given text description involves engaging two types of information: the content, which includes information explicitly described in the text (e.g., color, composition, etc.), and the style, which is usually not well described in the text (e.g., location, quantity, size, etc.). However, in previous works, it is typically treated as a process of generating images only from the content, i.e., without considering learning meaningful style representations. In this paper, we aim to learn two variables that are disentangled in the latent space, representing content and style respectively. We achieve this by augmenting current text-to-image synthesis frameworks with a dual adversarial inference mechanism. Through extensive experiments, we show that our model learns, in an unsupervised manner, style representations corresponding to certain meaningful information present in the image that are not well described in the text. The new framework also improves the quality of synthesized images when evaluated on Oxford-102, CUB and COCO datasets.

Authors (6)
  1. Qicheng Lao
  2. Mohammad Havaei
  3. Ahmad Pesaranghader
  4. Francis Dutil
  5. Lisa Di Jorio
  6. Thomas Fevens
Citations (39)