- The paper presents a novel semi-supervised approach that uses a pseudo sketch feature generator to synthesize accurate face sketches from uncontrolled photos.
- It combines exemplar-based techniques with generative adversarial networks, incorporating perceptual and total variation losses to enhance sketch realism.
- Empirical results show state-of-the-art performance with improved SSIM and FSIM scores, demonstrating effective generalization to in-the-wild face photos.
Semi-Supervised Learning for Face Sketch Synthesis in the Wild
The paper under review presents a semi-supervised learning framework for face sketch synthesis, specifically targeting adaptation to face photos captured under uncontrolled conditions, commonly referred to as "face photos in the wild." Traditional approaches have faced significant limitations: exemplar-based methods rely on pre-aligned photo-sketch datasets and computationally intensive patch matching, while learning-based methods depend on large paired training sets. This paper introduces a semi-supervised deep neural network architecture that mitigates these limitations and extends the applicability of face sketch synthesis to more varied and unpredictable data.
The authors propose a hybrid approach that combines elements of exemplar-based methods with generative adversarial networks (GANs) and a perceptual loss. The cornerstone of their methodology is a pseudo sketch feature generator: it approximates the sketch representation of a face photo by matching photo patches in the deep feature space of a pre-trained VGG-19 network. Matching is performed by cosine distance between photo feature patches, and the sketch feature patches paired with the best matches are assembled into a pseudo sketch feature. Because such a target can be constructed for any photo, the network can be trained without a large dataset of paired photo-sketch mappings, extending generalization beyond controlled datasets.
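The minimal PyTorch sketch below illustrates this matching step under stated assumptions; it is not the authors' implementation. The layer choice (relu3_1 only), the 3x3 patch size, and the global (rather than spatially constrained) search over reference patches are simplifications for brevity.

```python
# Hedged sketch of the pseudo sketch feature idea: match input-photo feature
# patches to reference-photo feature patches by cosine similarity, then take
# the paired sketch patches at the winning indices. Inputs are assumed to be
# 1x3xHxW tensors (grayscale sketches replicated to 3 channels).
import torch
import torch.nn.functional as F
import torchvision.models as models

# Truncate VGG-19 at relu3_1 (index 11); the paper may use other/additional layers.
vgg = models.vgg19(weights="IMAGENET1K_V1").features[:12].eval()

def extract_patches(feat, k=3):
    """Unfold a (1, C, H, W) feature map into (H*W, C*k*k) patch vectors."""
    patches = F.unfold(feat, kernel_size=k, padding=k // 2)   # (1, C*k*k, H*W)
    return patches.squeeze(0).t()                             # (H*W, C*k*k)

@torch.no_grad()
def pseudo_sketch_feature(photo, ref_photos, ref_sketches, k=3):
    """Approximate the sketch feature of `photo` from a small aligned reference set."""
    q = extract_patches(vgg(photo), k)                        # patches of the input photo
    ref_p = torch.cat([extract_patches(vgg(p), k) for p in ref_photos])
    ref_s = torch.cat([extract_patches(vgg(s), k) for s in ref_sketches])
    # cosine similarity between every input patch and every reference photo patch
    sim = F.normalize(q, dim=1) @ F.normalize(ref_p, dim=1).t()
    best = sim.argmax(dim=1)                                  # best-matching reference patch per input patch
    return ref_s[best]                                        # corresponding paired sketch patches
```

In the paper, these matched sketch patches serve as the fixed target of the perceptual loss described next, rather than being rendered directly as a sketch.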
The architecture employs a residual network with skip connections as the generator, which synthesizes sketches directly from photos by minimizing a perceptual loss derived from the pseudo sketch features together with an adversarial loss that encourages realism. An additional total variation loss suppresses unnatural artifacts and noise in the generated sketches, further refining output quality.
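A hedged sketch of how these three terms might be combined is shown below; the loss weights, the logit-valued discriminator output, and the function signatures are illustrative assumptions, not the paper's reported hyper-parameters.

```python
# Combine the perceptual (pseudo sketch feature), adversarial, and total
# variation terms into a single generator objective. Weights are placeholders.
import torch
import torch.nn.functional as F

def total_variation_loss(img):
    """Penalize intensity differences between neighbouring pixels (NCHW tensor)."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw

def generator_loss(gen_sketch_feat, pseudo_feat, disc_out_fake, gen_sketch,
                   w_perc=1.0, w_adv=1e-3, w_tv=1e-5):
    # perceptual term: pull the generated sketch's feature patches toward the pseudo sketch feature
    perceptual = F.mse_loss(gen_sketch_feat, pseudo_feat)
    # adversarial term: encourage the discriminator to rate the sketch as real
    adversarial = F.binary_cross_entropy_with_logits(
        disc_out_fake, torch.ones_like(disc_out_fake))
    # total variation term: damp high-frequency noise and blocky artifacts
    tv = total_variation_loss(gen_sketch)
    return w_perc * perceptual + w_adv * adversarial + w_tv * tv
```

The total variation term simply penalizes differences between neighbouring pixels, which counteracts the high-frequency noise an adversarial term can otherwise introduce.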
The empirical results demonstrate that the model achieves state-of-the-art performance on public face sketch benchmarks, with superior or competitive SSIM and FSIM scores compared to existing methods, including GAN frameworks designed specifically for sketch synthesis. The quantitative results are supported by qualitative assessments showing that the model handles diverse "in-the-wild" conditions, surpassing purely data-driven methods, which often generalize poorly without extensive paired datasets. The demonstrated efficacy of the pseudo sketch feature loss in preserving facial structure and detail underscores how much this contribution drives overall performance.
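For context, benchmark SSIM scores of this kind are typically computed per photo against the artist-drawn ground-truth sketch and averaged over the test set. The snippet below shows one common way to do this with scikit-image; it is an illustration, not the authors' evaluation code, and FSIM is omitted because it has no equally standard library implementation.

```python
# Illustrative SSIM scoring between a synthesized sketch and its ground truth.
from skimage.metrics import structural_similarity as ssim
from skimage.io import imread
from skimage.color import rgb2gray

def load_gray(path):
    """Read an image and return a grayscale float array in [0, 1]."""
    img = imread(path)
    return rgb2gray(img) if img.ndim == 3 else img.astype(float) / 255.0

def sketch_ssim(generated_path, ground_truth_path):
    gen, gt = load_gray(generated_path), load_gray(ground_truth_path)
    return ssim(gen, gt, data_range=1.0)

# scores = [sketch_ssim(g, t) for g, t in test_pairs]  # average over the benchmark
```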
A further practical gain is computation time, a critical concern in real applications: the efficient feed-forward generator produces sketches rapidly at test time. The use of a relatively small set of aligned photo-sketch pairs as a reference set, accompanied by a larger corpus of unpaired face photos, showcases the semi-supervised nature of the approach and allows it to overcome previous constraints on dataset size and diversity.
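A minimal training-loop sketch under the same assumptions follows; it reuses pseudo_sketch_feature, extract_patches, vgg, and generator_loss from the earlier snippets, while load_reference_set, load_wild_photos, SketchGenerator, and discriminator are hypothetical placeholders rather than names from the paper. The discriminator's own update step is omitted for brevity.

```python
# Hedged sketch of the semi-supervised loop: the small aligned reference set is
# only used for patch matching, so every unpaired wild photo still yields a
# training signal for the generator.
import torch

ref_photos, ref_sketches = load_reference_set()   # small aligned photo-sketch pairs (assumed helper)
wild_photos = load_wild_photos()                  # larger corpus of unpaired face photos (assumed helper)

generator = SketchGenerator()                     # residual generator with skip connections (assumed class)
optimizer = torch.optim.Adam(generator.parameters(), lr=2e-4)

for photo in wild_photos:                         # no ground-truth sketch needed for these photos
    target = pseudo_sketch_feature(photo, ref_photos, ref_sketches)  # fixed regression target
    sketch = generator(photo)
    loss = generator_loss(extract_patches(vgg(sketch)), target,
                          discriminator(sketch), sketch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```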
In conclusion, this research makes a significant contribution to the field of face sketch synthesis by not only providing a novel method for leveraging small reference datasets but also by illustrating how additional, unpaired images can be exploited to boost model generalization and performance in varying conditions. The incorporation of sophisticated loss functions within a semi-supervised framework presents a versatile solution to traditional face sketch synthesis challenges.
Future research could explore improvements to the perceptual loss, for instance by leveraging more advanced deep learning models and architectures, potentially enhancing the fidelity and detail of generated sketches. Richer feature representations could also make the model more resilient to extreme in-the-wild conditions such as occlusions or low-quality inputs. These developments would expand the practical applications of sketch synthesis, from law enforcement to digital entertainment.