- The paper demonstrates that feature whitening and coloring transforms within a feed-forward network enable real-time universal style transfer.
- The approach projects content features into a decorrelated (whitened) space and then re-imposes style by matching their covariance to that of the style features.
- Empirical results show superior visual fidelity and efficiency compared to traditional optimization-based style transfer methods.
Universal Style Transfer via Feature Whitening and Coloring Transforms
The paper "Universal Style Transfer via Feature Whitening and Coloring Transforms" demonstrates a novel approach to universal style transfer, addressing the limitations of existing methods in terms of generalization, quality, and efficiency. The methodology presented hinges on feature transforms—specifically, whitening and coloring transforms (WCT)—embedded within an image reconstruction network to facilitate style transfer in a feed-forward manner.
Core Concept
The foundational concept of this paper is to use WCT to directly match the feature statistics (channel-wise mean and covariance) of a content image to those of a style image. This departs from previous methods, which relied either on iterative optimization, such as Gram-matrix minimization, or on feed-forward networks trained for specific styles.
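Concretely, with mean-centered content features $f_c \in \mathbb{R}^{C \times H_c W_c}$, style features $f_s$, and covariance eigendecompositions $f_c f_c^\top = E_c D_c E_c^\top$ and $f_s f_s^\top = E_s D_s E_s^\top$, the two transforms are (in the paper's notation, up to minor details):

```latex
\hat{f}_c    = E_c \, D_c^{-1/2} \, E_c^{\top} f_c        % whitening: \hat{f}_c \hat{f}_c^{\top} = I
\hat{f}_{cs} = E_s \, D_s^{1/2}  \, E_s^{\top} \hat{f}_c  % coloring:  \hat{f}_{cs} \hat{f}_{cs}^{\top} = f_s f_s^{\top}
```

The style mean is added back to $\hat{f}_{cs}$ before the result is fed to the decoder.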
Methodology
The authors employ the VGG-19 network as a fixed feature extractor and train a symmetric decoder for image reconstruction, enabling feed-forward inversion of deep features back to images. Notably, the decoder is trained only to reconstruct natural images (with pixel and feature reconstruction losses), so no style images are involved in training. The WCT mechanism is pivotal, operating in two stages (a code sketch follows the list below):
- Whitening Transform: This decorrelates the content features by transforming them into a whitened space, effectively stripping the style characteristics of the original content image while maintaining structural information.
- Coloring Transform: The whitened content features are then transformed such that their covariance matches that of the style features, incorporating the desired stylistic attributes into the content image.
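A minimal NumPy sketch of both transforms (illustrative, not the authors' implementation; `content_feat` and `style_feat` are assumed to be C x (H*W) matrices of VGG features):

```python
import numpy as np

def wct(content_feat, style_feat, eps=1e-5):
    """Whitening-coloring transform on C x (H*W) feature matrices."""
    # Center both feature sets channel-wise.
    mc = content_feat.mean(axis=1, keepdims=True)
    ms = style_feat.mean(axis=1, keepdims=True)
    fc = content_feat - mc
    fs = style_feat - ms

    # Eigendecompose the content covariance: cov_c = Ec @ diag(dc) @ Ec.T
    cov_c = fc @ fc.T / (fc.shape[1] - 1) + eps * np.eye(fc.shape[0])
    dc, Ec = np.linalg.eigh(cov_c)
    # Whitening: decorrelate the content channels (identity covariance).
    whitened = Ec @ np.diag(dc ** -0.5) @ Ec.T @ fc

    # Eigendecompose the style covariance, then color the whitened
    # features so their covariance matches that of the style.
    cov_s = fs @ fs.T / (fs.shape[1] - 1) + eps * np.eye(fs.shape[0])
    ds, Es = np.linalg.eigh(cov_s)
    colored = Es @ np.diag(ds ** 0.5) @ Es.T @ whitened

    # Re-add the style mean before decoding.
    return colored + ms
```

Small eigenvalues need care in practice; the `eps` regularizer above is one option, while the paper's implementation truncates near-zero eigenvalues instead.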
The method extends to a multi-level stylization framework in which WCT is applied sequentially across VGG-19 layers, from the deepest (relu5_1) down to the shallowest (relu1_1). This coarse-to-fine scheme captures a broad spectrum of style characteristics, from high-level structural patterns down to low-level textures; a sketch of the loop follows.
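A hypothetical sketch of this loop, assuming per-level encoder/decoder pairs (`encoders[k]` extracting relu{k}_1 features and `decoders[k]` inverting them; both names are ours, not the paper's) and the `wct` function above:

```python
def multi_level_stylize(content_img, style_img, encoders, decoders,
                        levels=(5, 4, 3, 2, 1)):
    """Apply WCT sequentially from deep (relu5_1) to shallow (relu1_1) features."""
    img = content_img
    for k in levels:
        fc = encoders[k](img)        # features of the current (partially stylized) image
        fs = encoders[k](style_img)  # style features at the same level
        fcs = wct(fc, fs)            # match covariance at this level
        img = decoders[k](fcs)       # invert back to image space
    return img
```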
Results
The stylized images generated by this method exhibit high visual quality, often preserving distinctive stylistic elements better than existing approaches. The authors also provide a user control over the degree of stylization, a style-weight parameter that blends transformed and original features, enhancing the method's practical utility.
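This control is a single style-weight parameter $\alpha$ that linearly blends the WCT output with the original content features before decoding:

```latex
\hat{f}_{cs} \leftarrow \alpha \, \hat{f}_{cs} + (1 - \alpha) \, f_c,
  \qquad \alpha \in [0, 1]
```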
Quantitative evaluations, covariance-matrix differences between stylized outputs and style images alongside a user preference study, support these claims. Notably, the technique generalizes to arbitrary unseen styles without requiring style-specific training.
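One plausible instantiation of the covariance-difference score, hedged since the paper's exact normalization may differ (here $\phi_l$ denotes VGG features at level $l$, and $I_{cs}$, $I_s$ the stylized output and style image):

```latex
L_s = \frac{1}{L} \sum_{l=1}^{L}
  \left\lVert \operatorname{cov}\big(\phi_l(I_{cs})\big)
            - \operatorname{cov}\big(\phi_l(I_s)\big) \right\rVert_F
```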
Comparative Analysis
Comparisons with other notable methods, including Chen et al.'s patch-based style swap, Huang et al.'s adaptive instance normalization (AdaIN), Johnson et al.'s per-style feed-forward network, and Gatys et al.'s optimization-based method, highlight the advantages of WCT in style fidelity and efficiency. The results suggest that WCT preserves content structure better while producing more visually appealing textures across a range of stylization tasks.
Implications and Future Work
The implications of this research are multifaceted:
- Efficiency: The proposed method offers a significant reduction in computational overhead compared to optimization-based techniques, making real-time applications feasible.
- Generality: The learning-free nature of WCT allows for immediate application to unseen styles, overcoming a major hurdle faced by feed-forward neural networks trained on fixed style sets.
- Flexibility: User controls over style intensity and spatial regions make the method versatile for practical image editing (a minimal sketch of spatial control follows this list).
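A hypothetical sketch of mask-based spatial control, using the `wct` function above (our simplification; the paper applies the transform per masked region, and `mask` here is a flat boolean array at feature resolution):

```python
def spatial_wct(fc, fs_a, fs_b, mask):
    """Stylize the masked region with style A and the rest with style B.

    fc, fs_a, fs_b: C x (H*W) feature matrices; mask: boolean array of length H*W.
    """
    out = wct(fc, fs_b)                    # background style everywhere
    out[:, mask] = wct(fc, fs_a)[:, mask]  # overwrite masked columns with style A
    return out
```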
Future developments could further optimize the WCT step, whose eigendecompositions dominate runtime, for example with GPU-friendly implementations that accommodate higher resolutions and more intricate styles. Extending the framework to other domains such as video style transfer or 3D model texturing could open new avenues for research and commercial applications.
Conclusion
This paper contributes a robust, efficient, and generalizable framework for universal style transfer, leveraging the concept of feature statistics matching through whitening and coloring transforms. The empirical evidence and qualitative assessments underscore the effectiveness of this method, distinguishing it as a significant step forward in the field of neural style transfer.