
SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color (1902.06838v1)

Published 18 Feb 2019 in cs.CV

Abstract: We present a novel image editing system that generates images as the user provides a free-form mask, sketch, and color as input. Our system consists of an end-to-end trainable convolutional network. Contrary to existing methods, our system wholly utilizes free-form user input with color and shape. This allows the system to respond to the user's sketch and color input, using it as a guideline to generate an image. In our particular work, we trained the network with an additional style loss, which made it possible to generate realistic results despite large portions of the image being removed. Our proposed network architecture, SC-FEGAN, is well suited to generating high-quality synthetic images from intuitive user inputs.

Citations (288)

Summary

  • The paper introduces a novel GAN architecture that uses free-form sketch and color inputs for interactive face editing.
  • It employs a U-net generator with gated convolutional layers and an SN-PatchGAN discriminator to maintain realistic details.
  • The research demonstrates robust image completion by integrating style loss with advanced training on the CelebA-HQ dataset.

SC-FEGAN: An Overview of a Face Editing Generative Adversarial Network

The paper "SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color" by Youngjoo Jo and Jongyoul Park introduces a generative adversarial network (GAN) architecture specifically designed for interactive face image editing. The model leverages free-form input in the form of masks, sketches, and color to produce realistic and high-quality images, addressing the common challenges faced in image completion tasks.

Core Contributions and Methodology

The primary contribution of SC-FEGAN lies in its enhanced interactivity and flexibility for image editing tasks. The system exploits user-provided sketches and color inputs to guide the image generation process, allowing for modifications even in images with extensive erased portions. The architecture deviates from prior models by incorporating style loss alongside a GAN loss to maintain realism in the generated outputs. Notably, SC-FEGAN employs a U-net-like generator architecture equipped with gated convolutional layers, which facilitates efficient training and inference.
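The gating mechanism mentioned above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (which uses learned multi-channel convolutions inside a U-net); it only shows the core idea of a gated convolution: a sigmoid gating branch produces a soft mask that modulates the feature branch, letting the layer suppress responses from erased (invalid) pixels. The kernel values and activation choices here are illustrative assumptions.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D cross-correlation of a single-channel image x with kernel w."""
    kh, kw = w.shape
    h, w_ = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h, w_))
    for i in range(h):
        for j in range(w_):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def gated_conv(x, w_feat, w_gate):
    """Gated convolution: the feature branch is modulated elementwise by a
    learned soft gate in (0, 1), so masked regions can be ignored."""
    feat = np.tanh(conv2d(x, w_feat))                 # feature branch
    gate = 1.0 / (1.0 + np.exp(-conv2d(x, w_gate)))   # sigmoid gating branch
    return feat * gate
```

In the full network, every vanilla convolution is replaced by such a gated pair, which is what allows the generator to handle free-form masks of arbitrary shape.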

Key contributions include:

  • A novel network architecture combining a U-net structure with gated convolutional layers, which trains and runs faster than the coarse-to-fine (Coarse-Refined) networks used in prior work.
  • Implementation of an SN-PatchGAN discriminator, effectively managing awkward edges typically encountered in image completion tasks.
  • A comprehensive training approach incorporating style loss into the GAN framework, augmenting the network’s capability to edit substantial image sections while maintaining fine details like hairstyles or accessories such as earrings.
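The style loss in the third contribution is commonly computed as a distance between Gram matrices of deep feature maps (as in neural style transfer); the sketch below assumes that formulation with an L1 distance and a simple normalization, which may differ in constants from the paper's exact loss. `feat_gen` and `feat_gt` stand for feature maps (channels × spatial positions) extracted from the generated and ground-truth images by a fixed feature network.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (C, H*W) feature map: channel-wise feature
    correlations, normalized by the number of entries."""
    c, n = features.shape
    return features @ features.T / (c * n)

def style_loss(feat_gen, feat_gt):
    """L1 distance between Gram matrices of generated vs. ground-truth
    features; penalizes mismatched texture statistics, not exact pixels."""
    return np.abs(gram_matrix(feat_gen) - gram_matrix(feat_gt)).mean()
```

Because the Gram matrix discards spatial layout, this term encourages the network to reproduce texture statistics (hair strands, skin grain) even in large erased regions where no pixel-level target alignment exists.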

Training Data Generation

A significant aspect of this research involves the creation of suitable training data using the CelebA-HQ dataset, which was processed to include sketch and color domains that mimic user inputs. This involved employing HED edge detection for sketch generation and median filtering followed by GFC segmentation for color domain creation. Additionally, free-form masks centered on eye positions were utilized to enhance the treatment of facial details, exemplifying the robustness of the data preparation strategy.
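A free-form mask of the kind described above can be sketched as random "brush strokes" drawn by a random walk; this is a hypothetical minimal version (the stroke counts, step sizes, and brush width are assumed parameters), and it omits the paper's additional step of centering some masks on detected eye positions.

```python
import numpy as np

def free_form_mask(h, w, max_strokes=8, max_steps=10, brush=3, rng=None):
    """Paint random-walk brush strokes on a blank canvas; 1 = erased region.
    Parameters are illustrative, not taken from the paper."""
    if rng is None:
        rng = np.random.default_rng(0)
    mask = np.zeros((h, w), dtype=np.uint8)
    for _ in range(rng.integers(1, max_strokes + 1)):
        y, x = int(rng.integers(0, h)), int(rng.integers(0, w))
        for _ in range(rng.integers(1, max_steps + 1)):
            # take a random step, then stamp a square brush at the new point
            y = int(np.clip(y + rng.integers(-5, 6), 0, h - 1))
            x = int(np.clip(x + rng.integers(-5, 6), 0, w - 1))
            mask[max(0, y - brush):y + brush, max(0, x - brush):x + brush] = 1
    return mask
```

During training, a triple of (masked image, HED sketch, median-filtered color map) restricted to such a mask stands in for the user's eventual free-form input.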

Comparisons and Evaluations

The paper addresses limitations of previous works such as Deepfill and FaceShop by demonstrating improved edge management and detail preservation in image completion tasks. SC-FEGAN surpasses existing methods by producing visually convincing results even under conditions where large image areas are occluded or require significant modification. Key evaluations highlighted the model’s ability to generate coherent results using inputs ranging from minor adjustments to substantial facial feature modifications.

Practical Implications and Future Directions

Practically, SC-FEGAN opens avenues for enhanced user-driven image editing applications, offering a tool that can be utilized by non-experts to generate professional-quality alterations in face images. This has significant implications for industries reliant on visual content such as digital media, entertainment, and online personalization platforms.

Theoretically, the integration of style loss within the GAN framework in SC-FEGAN provides a future pathway for exploring the balance between content and style in various image translation tasks. Extending SC-FEGAN's approach to other image domains could broaden interactive image editing capabilities beyond face editing.

In conclusion, SC-FEGAN presents a compelling advance in the field of GAN-based image editing, providing a model that effectively balances user interactivity with high-quality image synthesis. Future explorations may refine this balance further, extending its application to broader contexts in image manipulation and generation.
