High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks (1710.10182v2)

Published 27 Oct 2017 in cs.CV

Abstract: Synthesizing face sketches from real photos and its inverse have many applications. However, photo/sketch synthesis remains a challenging problem due to the fact that photo and sketch have different characteristics. In this work, we consider this task as an image-to-image translation problem and explore the recently popular generative models (GANs) to generate high-quality realistic photos from sketches and sketches from photos. Recent GAN-based methods have shown promising results on image-to-image translation problems and photo-to-sketch synthesis in particular, however, they are known to have limited abilities in generating high-resolution realistic images. To this end, we propose a novel synthesis framework called Photo-Sketch Synthesis using Multi-Adversarial Networks, (PS2-MAN) that iteratively generates low resolution to high resolution images in an adversarial way. The hidden layers of the generator are supervised to first generate lower resolution images followed by implicit refinement in the network to generate higher resolution images. Furthermore, since photo-sketch synthesis is a coupled/paired translation problem, we leverage the pair information using CycleGAN framework. Both Image Quality Assessment (IQA) and Photo-Sketch Matching experiments are conducted to demonstrate the superior performance of our framework in comparison to existing state-of-the-art solutions. Code available at: https://github.com/lidan1/PhotoSketchMAN.


High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks: An Overview

The paper "High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks" introduces a framework for synthesizing high-resolution, realistic facial sketches from photos and photos from sketches, building on generative adversarial networks (GANs). The approach targets a known limitation of earlier GAN-based methods: their difficulty in generating high-resolution outputs.

Problem Context

Facial photo-sketch synthesis is useful in domains such as law enforcement and digital entertainment. The difficulty stems from the intrinsic differences between photos and sketches, and earlier methods have struggled to bridge this stylistic gap.

Methodology

The paper frames facial photo-sketch synthesis as an image-to-image translation problem and introduces the PS2-MAN framework (Photo-Sketch Synthesis using Multi-Adversarial Networks). The framework extends the GAN architecture by generating images iteratively, from low to high resolution, in an adversarial setting: the generator's hidden layers are supervised to produce lower-resolution images first, which the network then implicitly refines into higher-resolution outputs. Because photo-sketch synthesis is a paired translation problem, the framework also exploits the pairing information through the CycleGAN framework.
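As a rough illustration of the low-to-high resolution idea, the sketch below builds a small image pyramid by repeated 2x2 average pooling, giving one reference image per resolution level that intermediate generator outputs could be compared against. The function names, the pooling method, and the toy 8x8 image are assumptions made for illustration; the source does not specify how the lower-resolution targets are produced.

```python
def downsample2x(image):
    """Halve height and width of a grayscale image by averaging each 2x2 block."""
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def build_pyramid(image, levels=3):
    """Return versions of `image` ordered from coarsest to finest."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid[::-1]


# Toy 8x8 "image" (hypothetical data); a real face image would be far larger.
target = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(target, levels=3)
sizes = [(len(p), len(p[0])) for p in pyramid]  # coarsest first
```

Average pooling is only one possible choice here; bilinear or bicubic resizing would serve the same purpose of providing a target at each scale.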

Multi-Adversarial Networks

The PS2-MAN framework uses multiple discriminator networks, each supplying an adversarial supervisory signal at a different resolution level, so that the generated images are both high-resolution and free of visible artifacts. This multi-level supervision lets the generator refine image features step by step, which improves the visual quality and realism of the synthesized images.
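One way to picture the multi-level supervision is as a sum of per-resolution adversarial losses, one for each discriminator. The sketch below assumes a least-squares GAN objective, equal weights per scale, and three scales labeled 64, 128, and 256; none of these details is stated in the text above, and the discriminator scores are made-up numbers rather than outputs of a real network.

```python
def lsgan_generator_loss(scores):
    """Least-squares generator loss: mean of (D(G(x)) - 1)^2 over the scores."""
    return sum((s - 1.0) ** 2 for s in scores) / len(scores)


def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: real scores pushed to 1, fake to 0."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def multi_adversarial_generator_loss(scores_per_scale, weights=None):
    """Weighted sum of generator losses, one term per discriminator/scale."""
    if weights is None:
        weights = [1.0] * len(scores_per_scale)  # assumption: equal weighting
    if len(weights) != len(scores_per_scale):
        raise ValueError("need exactly one weight per scale")
    return sum(
        w * lsgan_generator_loss(s) for w, s in zip(weights, scores_per_scale)
    )


# Hypothetical discriminator scores for generated images at each resolution.
fake_scores = {64: [0.5, 0.5], 128: [1.0, 1.0], 256: [0.0, 1.0]}
total = multi_adversarial_generator_loss(list(fake_scores.values()))
```

Because every scale contributes its own term, a generator that fools only the highest-resolution discriminator is still penalized for poor intermediate outputs.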

Comparative Analysis

PS2-MAN is benchmarked against existing state-of-the-art synthesis techniques, including Pix2Pix, CycleGAN, and DualGAN. Both quantitative image quality metrics (SSIM, FSIM) and qualitative assessments indicate marked improvements over these baselines. The framework also achieves superior results in photo-sketch matching experiments, which supports its practical usefulness.
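For readers unfamiliar with SSIM, the sketch below computes a simplified, single-window version of the index over two flattened grayscale images. It is an assumption-laden illustration: the standard metric averages the same expression over local windows, and this is not the evaluation code used in the paper. The constants follow the commonly used defaults K1 = 0.01 and K2 = 0.03.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length lists of pixel intensities."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the usual defaults K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel values standing in for a synthesized and a reference image.
reference = [0.0, 64.0, 128.0, 255.0]
inverted = [255.0, 128.0, 64.0, 0.0]
same_score = global_ssim(reference, reference)
diff_score = global_ssim(reference, inverted)
```

Identical inputs score 1, and the score drops as luminance, contrast, or structure diverge, which is why higher SSIM is read as closer agreement with the reference image.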

Implications and Future Work

The paper's contributions have dual significance. Theoretically, it elucidates the advantages of adopting a multi-adversarial approach within generative models for image synthesis tasks. Practically, the framework's capacity to generate artifact-free, high-resolution images presents considerable implications for applications demanding high fidelity between synthesized and real images, such as biometric identification.

Future research avenues could explore the application of the PS2-MAN framework in other domains requiring high-resolution image translations, including stylized image generation and more complex heterogeneous image-to-image translation tasks. Further, refinements in network architecture and training strategies might enhance convergence speeds or reduce computational overhead, making the framework more accessible for real-time applications.

In conclusion, the PS2-MAN framework advances photo-sketch synthesis through both methodological innovation and practical applicability. Its gains in synthesis quality and matching accuracy point to potential for broader applications and provide a reference point for future research in generative model-based image synthesis.


Authors (3)

Lidan Wang (22 papers)
Vishwanath A. Sindagi (21 papers)
Vishal M. Patel (230 papers)

Citations (128)


GitHub

GitHub - lidan1/PhotoSketchMAN: Code for paper "High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks" (40 stars)