
Creative Sketch Generation (2011.10039v2)

Published 19 Nov 2020 in cs.CV and cs.AI

Abstract: Sketching or doodling is a popular creative activity that people engage in. However, most existing work in automatic sketch understanding or generation has focused on sketches that are quite mundane. In this work, we introduce two datasets of creative sketches -- Creative Birds and Creative Creatures -- containing 10k sketches each along with part annotations. We propose DoodlerGAN -- a part-based Generative Adversarial Network (GAN) -- to generate unseen compositions of novel part appearances. Quantitative evaluations as well as human studies demonstrate that sketches generated by our approach are more creative and of higher quality than existing approaches. In fact, in Creative Birds, subjects prefer sketches generated by DoodlerGAN over those drawn by humans! Our code can be found at https://github.com/facebookresearch/DoodlerGAN and a demo can be found at http://doodlergan.cloudcv.org.

Citations (70)

Summary

  • The paper introduces DoodlerGAN, a part-based GAN that generates creative sketches by synthesizing one part at a time.
  • It presents two datasets, Creative Birds and Creative Creatures, each containing 10,000 creative sketches with part annotations.
  • Evaluation with FID, newly proposed metrics, and human studies indicates higher sketch quality and diversity than existing approaches.

Creative Sketch Generation: An Academic Overview

The paper "Creative Sketch Generation" by Songwei Ge, Vedanuj Goswami, C. Lawrence Zitnick, and Devi Parikh presents an approach to automatically generating creative sketches with a deep learning model, DoodlerGAN. Most prior work on sketch understanding and generation targets mundane, canonical drawings; this work focuses on creative sketches instead, extending generative adversarial networks (GANs) to a more open-ended creative setting.

Dataset Collection and Design

The paper introduces two datasets, Creative Birds and Creative Creatures, each containing 10,000 creative sketches along with part annotations. Unlike existing datasets that capture canonical representations of objects, these datasets focus on artistic and imaginative representations of birds and generic creatures. They were collected via a structured sketching exercise on Amazon Mechanical Turk, designed to engage participants in a creative process.

DoodlerGAN Framework

DoodlerGAN, a part-based generative adversarial network, is the centerpiece of the paper. Rather than producing a whole sketch in one pass, the model generates it part by part, which lets it compose novel part appearances into previously unseen combinations, much as a person builds up a drawing. The architecture has two key components: a part generator and a part selector. The part generator uses conditional GANs based on the StyleGAN2 architecture, while the part selector is a convolutional network that predicts which part to draw next. This design aims to foster diversity and creativity in the generated sketches by exploiting the compositional nature of many creative designs.
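The alternation between part selector and part generator can be illustrated with a minimal control-flow sketch. Everything here is a hypothetical stand-in: the part vocabulary, the stop token, and both functions use random choices in place of the trained networks, and strokes are just lists of 2D points. It shows only the sequential loop, not the paper's actual models.

```python
import random

# Hypothetical part vocabulary and stop token, chosen for illustration only.
PARTS = ["eye", "head", "body", "beak", "legs", "wings", "tail"]
STOP = "stop"

def select_next_part(canvas, rng):
    """Stand-in for the part selector: pick an undrawn part, or STOP."""
    remaining = [p for p in PARTS if p not in canvas]
    if not remaining:
        return STOP
    # Allow an early stop once at least three parts are on the canvas.
    if len(canvas) >= 3 and rng.random() < 0.2:
        return STOP
    return rng.choice(remaining)

def generate_part(part, canvas, rng):
    """Stand-in for a conditional part generator: fake strokes that depend on the canvas."""
    offset = len(canvas)  # crude "conditioning" on what has been drawn so far
    return [(offset + rng.random(), offset + rng.random()) for _ in range(4)]

def generate_sketch(seed=0):
    """Build a sketch part by part until the selector emits STOP."""
    rng = random.Random(seed)
    canvas = {}  # maps part name -> strokes, in drawing order
    while True:
        part = select_next_part(canvas, rng)
        if part == STOP:
            break
        canvas[part] = generate_part(part, canvas, rng)
    return canvas

if __name__ == "__main__":
    sketch = generate_sketch(seed=1)
    print("drawing order:", list(sketch))
```

In a real system the selector and each generator would take the rendered partial sketch as input, so that every new part is conditioned on what has already been drawn.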

Evaluation Metrics and Outcomes

The paper evaluates the model both quantitatively and through human studies. The Fréchet Inception Distance (FID) and generation diversity (GD) are computed from the features of an Inception model trained on a large sketch dataset. DoodlerGAN achieves lower (better) FID scores than the baselines, which include unconditional and conditional SketchRNN models and StyleGAN2, suggesting that its outputs are closer in distribution to the human-drawn sketches.
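For reference, FID is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated samples: the squared distance between the means plus Tr(C_r + C_g - 2 (C_r C_g)^(1/2)). The following is a minimal NumPy sketch of that standard formula, not the paper's evaluation code; random arrays stand in for the Inception features, and the function name is an assumption.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two (n_samples, dim) feature arrays."""
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))
    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite inputs (up to numerical noise, hence the clip).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))        # placeholder for real-sketch features
    fake = rng.normal(size=(200, 4)) + 0.5  # placeholder for generated-sketch features
    print(frechet_distance(real, fake))
```

Lower values mean the two feature distributions are closer; the distance is zero when both sets have identical mean and covariance.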

Additionally, the paper introduces two new metrics, characteristic score (CS) and semantic diversity score (SDS), to measure the recognizability and the diversity of generated sketches, respectively. Results indicate that DoodlerGAN outperforms competing approaches both in producing recognizable entities and in generating a diverse range of creative outputs.
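This summary does not give the exact definitions of CS and SDS, so the sketch below should be read only as an illustration of the two ideas, recognizability and category diversity, under stated assumptions. It assumes a classifier has already assigned a label to each generated sketch; the function names, the target-label set, and the use of Shannon entropy as a diversity proxy are all hypothetical and are not the paper's formulas.

```python
import math
from collections import Counter

def recognizability_rate(predicted_labels, target_labels):
    """Fraction of generated sketches whose predicted class falls in the target set."""
    hits = sum(1 for label in predicted_labels if label in target_labels)
    return hits / len(predicted_labels)

def category_entropy(predicted_labels):
    """Shannon entropy (in nats) of the predicted-category histogram.

    Used here only as an illustrative proxy for semantic diversity.
    """
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

if __name__ == "__main__":
    # Hypothetical classifier outputs for four generated sketches.
    labels = ["bird", "bird", "owl", "car"]
    print(recognizability_rate(labels, {"bird", "owl"}))
    print(category_entropy(labels))
```

A higher recognizability rate means more outputs are identified as the intended kind of object, while a higher entropy means the outputs are spread over more categories.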

Implications and Future Directions

This research has clear implications for AI in creative work. By enabling machines to generate creative content that people find engaging, it points toward AI-assisted design and brainstorming tools. In the human studies on Creative Birds, subjects preferred sketches generated by DoodlerGAN over those drawn by humans, which suggests the model could meaningfully support artistic processes.

Future research could further explore human-machine collaborative environments where AI supports artists in sketching endeavors. Moreover, expanding the current method to integrate more intricate part interactions could refine the ability to generate even more complex creative sketches. The introduction of highly creative datasets like Creative Birds and Creative Creatures provides a rich foundation for subsequent development in AI-driven creativity.

Conclusion

Overall, the paper presents a part-based framework for creative sketch generation and shows that a generative model can produce sketches that people judge to be creative and of high quality. The authors' release of the datasets, code, and a web demo invites further exploration and supports continued work on AI-assisted creative tools.
