
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer (2203.13248v1)

Published 24 Mar 2022 in cs.CV and cs.LG

Abstract: Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data. In this paper, we explore more challenging exemplar-based high-resolution portrait style transfer by introducing a novel DualStyleGAN with flexible control of dual styles of the original face domain and the extended artistic portrait domain. Different from StyleGAN, DualStyleGAN provides a natural way of style transfer by characterizing the content and style of a portrait with an intrinsic style path and a new extrinsic style path, respectively. The delicately designed extrinsic style path enables our model to modulate both the color and complex structural styles hierarchically to precisely pastiche the style example. Furthermore, a novel progressive fine-tuning scheme is introduced to smoothly transform the generative space of the model to the target domain, even with the above modifications on the network architecture. Experiments demonstrate the superiority of DualStyleGAN over state-of-the-art methods in high-quality portrait style transfer and flexible style control.

Authors (4)
  1. Shuai Yang (140 papers)
  2. Liming Jiang (29 papers)
  3. Ziwei Liu (368 papers)
  4. Chen Change Loy (288 papers)
Citations (96)

Summary

  • The paper presents a dual-path approach that separates intrinsic and extrinsic style modulation for precise exemplar-based high-resolution portrait transfer.
  • It employs a progressive fine-tuning training scheme, starting with same-domain tasks before advancing to complex cross-domain style transfers.
  • Empirical results show that DualStyleGAN outperforms existing models by preserving facial attributes and ensuring high-fidelity, 1024x1024 output quality.

Exemplar-Based High-Resolution Portrait Style Transfer via DualStyleGAN

The paper addresses the increasingly prominent problem of exemplar-based high-resolution portrait style transfer, proposing a new method termed DualStyleGAN. The model extends the foundational StyleGAN framework, which, although adept at artistic portrait generation through transfer learning, lacks a mechanism for exemplar-specific style transfer. DualStyleGAN introduces a dual-path design: an intrinsic style path continues to handle styles of the original face domain, while a new extrinsic style path handles style adaptation from exemplar portraits.

Methodological Advancements

DualStyleGAN employs dual style paths to modulate content and style separately within portraits. The intrinsic path retains the traditional behavior of StyleGAN, whereas the extrinsic path uses hierarchical transformations to align the output with the input exemplar's style. The paper emphasizes the hierarchical control afforded by the extrinsic style path, which embeds structural modifications at the coarse layers and color adjustments at the fine layers.
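The coarse/fine split described above can be mimicked in a minimal sketch, which is not the paper's actual implementation: each layer receives a shared intrinsic code, and the extrinsic code is injected as a structural residual at coarse layers and as a direct color-level blend at fine layers. The layer count follows StyleGAN2 at 1024x1024; the split point and the linear transform `T` are assumptions for illustration only.

```python
import numpy as np

NUM_LAYERS = 18          # StyleGAN2 at 1024x1024 uses 18 style layers
COARSE_FINE_SPLIT = 7    # assumption: layers 0-6 carry structure, 7-17 color
DIM = 512                # latent dimensionality

rng = np.random.default_rng(0)
# Stand-in for the learned extrinsic-path transform (a hypothetical
# placeholder, not the paper's modulative residual blocks).
T = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)

def dual_style_codes(z_intrinsic, z_extrinsic, w_structure=1.0, w_color=1.0):
    """Build per-layer style codes: intrinsic base plus hierarchical
    extrinsic edits, with separate weights for structure and color."""
    layers = []
    for i in range(NUM_LAYERS):
        s = z_intrinsic.copy()
        if i < COARSE_FINE_SPLIT:
            # coarse layers: structural residual derived from the exemplar code
            s += w_structure * (T @ z_extrinsic)
        else:
            # fine layers: color-level edit, here a simple additive blend
            s += w_color * z_extrinsic
        layers.append(s)
    return np.stack(layers)

codes = dual_style_codes(rng.standard_normal(DIM), rng.standard_normal(DIM))
print(codes.shape)  # (18, 512)
```

Setting `w_structure=0` leaves the coarse layers untouched, which is the kind of selective control the dual-path design is meant to provide.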

Training Scheme and Model Fine-tuning

A key innovation presented is the progressive fine-tuning approach designed for DualStyleGAN. This involves multi-phase training where the model first stabilizes on simpler tasks—such as same-domain style modulation—before advancing to more complex, cross-domain style transfers. Progressive tuning, accompanied by an adaptive fine-tuning framework, allows for graceful convergence of the model's generative space towards the target artistic domain without compromising on content accuracy or style diversity.
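The staged curriculum can be summarized as a simple phase schedule. The three-stage ordering follows the paper's description (easy same-domain tasks first, cross-domain transfer last), but the step boundaries below are illustrative placeholders, not values reported in the paper.

```python
def finetune_phase(step, boundaries=(1000, 3000)):
    """Map a global training step to a progressive fine-tuning phase.

    The phase ordering mirrors the paper's scheme; the numeric
    boundaries are assumptions chosen for illustration.
    """
    if step < boundaries[0]:
        return "I: color transfer within the source domain"
    if step < boundaries[1]:
        return "II: structure transfer within the source domain"
    return "III: cross-domain transfer to the artistic target domain"

print(finetune_phase(0))
print(finetune_phase(2500))
print(finetune_phase(9000))
```

The point of the schedule is that the extrinsic path is trained on progressively harder objectives, so the generative space drifts toward the target domain gradually rather than being fine-tuned on it from scratch.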

Furthermore, facial destylization is introduced as a novel component to rectify input-to-target misalignments by creating realistic facsimiles of artistic exemplars. This forms a basis for supervised learning, enriching model adaptability to varying styles while avoiding overfitting to specific domain artifacts.
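The destylization idea can be illustrated with a toy latent-refinement loop: pull an embedded artistic code toward a realistic latent prior while retaining fidelity to the exemplar. The quadratic objective, hyperparameters, and the use of the prior mean are all assumptions for this sketch; the paper's procedure operates on actual StyleGAN embeddings.

```python
import numpy as np

def destylize(z_art, prior_mean, steps=200, lr=0.05, fidelity=1.0):
    """Toy latent refinement by gradient descent on
       fidelity * ||z - z_art||^2 / 2  +  ||z - prior_mean||^2 / 2.
    The closed-form optimum is the weighted mean
       (fidelity * z_art + prior_mean) / (fidelity + 1),
    i.e. a code that trades off exemplar fidelity against realism."""
    z = z_art.copy()
    for _ in range(steps):
        grad = fidelity * (z - z_art) + (z - prior_mean)
        z -= lr * grad
    return z

rng = np.random.default_rng(1)
z_art = rng.standard_normal(512)   # embedded artistic exemplar (stand-in)
prior = np.zeros(512)              # realistic latent prior mean (stand-in)
z_real = destylize(z_art, prior)   # converges toward z_art / 2 here
```

The refined code `z_real` plays the role of the "realistic facsimile" that pairs with the artistic exemplar for supervision.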

Empirical Evidence and Comparison

The experimental setup demonstrates DualStyleGAN's superiority over contemporary models, namely StarGANv2, GNR, UI2I-style, Toonify, and FS-Ada, by achieving superior preservation of facial attributes and enhanced fidelity to exemplar styles. The capacity of DualStyleGAN to maintain detailing in high-resolution (1024x1024) outputs further distinguishes it from alternatives usually constrained to lower resolutions.

Quantitatively, user preference scores corroborate the model's performance, with competing methods rated markedly lower, particularly on exemplar-specific structure transfer, which remains a significant challenge for existing frameworks. Qualitative results highlight the model's versatility across wide style variations, from caricature and cartoon to anime, with substantial precision.

Implications and Future Prospects

The DualStyleGAN model has substantial implications for stylization and media content creation, where high-quality aesthetic transformation of images is paramount. The insights from residual connections and progressive fine-tuning might also anchor future developments in generative tasks involving complex cross-domain mappings. Future work could extend this architecture to tackle data bias, broaden the range of applicable styles, or streamline model training for wider accessibility.

This comprehensive approach, coupled with its architectural modifications, not only addresses existing limitations in style transfer but also paves the way for future generative image tasks requiring high precision and a nuanced handling of aesthetic transformation.
