Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Customize StyleGAN with One Hand Sketch (2310.18949v1)

Published 29 Oct 2023 in cs.CV

Abstract: Generating images from human sketches typically requires dedicated networks trained from scratch. In contrast, the emergence of the pre-trained Vision-LLMs (e.g., CLIP) has propelled generative applications based on controlling the output imagery of existing StyleGAN models with text inputs or reference images. Parallelly, our work proposes a framework to control StyleGAN imagery with a single user sketch. In particular, we learn a conditional distribution in the latent space of a pre-trained StyleGAN model via energy-based learning and propose two novel energy functions leveraging CLIP for cross-domain semantic supervision. Once trained, our model can generate multi-modal images semantically aligned with the input sketch. Quantitative evaluations on synthesized datasets have shown that our approach improves significantly from previous methods in the one-shot regime. The superiority of our method is further underscored when experimenting with a wide range of human sketches of diverse styles and poses. Surprisingly, our models outperform the previous baseline regarding both the range of sketch inputs and image qualities despite operating with a stricter setting: with no extra training data and single sketch input.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (1)
  1. Shaocong Zhang (1 paper)