Style Transfer by Relaxed Optimal Transport and Self-Similarity (1904.12785v2)

Published 29 Apr 2019 in cs.CV

Abstract: Style transfer algorithms strive to render the content of one image using the style of another. We propose Style Transfer by Relaxed Optimal Transport and Self-Similarity (STROTSS), a new optimization-based style transfer algorithm. We extend our method to allow user-specified point-to-point or region-to-region control over visual similarity between the style image and the output. Such guidance can be used to either achieve a particular visual effect or correct errors made by unconstrained style transfer. In order to quantitatively compare our method to prior work, we conduct a large-scale user study designed to assess the style-content tradeoff across settings in style transfer algorithms. Our results indicate that for any desired level of content preservation, our method provides higher quality stylization than prior work. Code is available at https://github.com/nkolkin13/STROTSS

Citations (261)

Summary

The paper introduces STROTSS, a style transfer method that employs relaxed optimal transport and self-similarity for improved stylization.
It utilizes an objective function combining content loss inspired by self-similarity and style loss based on an approximated Earth Mover’s Distance.
A user study with 662 participants validated its superior performance in preserving content and achieving high stylization quality compared to previous methods.

Style Transfer by Relaxed Optimal Transport and Self-Similarity: A Comprehensive Analysis

This essay examines the "Style Transfer by Relaxed Optimal Transport and Self-Similarity" (STROTSS) algorithm, proposed by Nicholas Kolkin, Jason Salavon, and Gregory Shakhnarovich. The work introduces a new optimization-based approach to style transfer, the task of rendering the content of one image in the style of another.

Core Contributions and Methodology

The authors introduce STROTSS, an optimization-based style transfer method built on relaxed optimal transport and self-similarity. The algorithm addresses the challenge of formalizing 'content' and 'style' by defining style as a distribution over features extracted by a deep neural network. The distance between the style distribution and the output distribution is measured using an approximation of the Earth Mover's Distance (EMD). Content is defined through self-similarity, which preserves spatial semantics without requiring strict adherence to pixel values.
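To make the idea of an approximated EMD concrete, the following is a minimal NumPy sketch of one common relaxation: each of the two marginal constraints of the transport problem is dropped in turn, so every feature is simply matched to its nearest neighbour, and the larger of the two resulting bounds is kept. The function names, the cosine ground cost, and the `eps` stabiliser are assumptions of this sketch rather than details taken from the summary above.

```python
import numpy as np

def cosine_distance_matrix(a, b, eps=1e-8):
    """Pairwise cosine distances between rows of a (n, d) and rows of b (m, d)."""
    a_unit = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_unit = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return 1.0 - a_unit @ b_unit.T

def relaxed_emd(a, b):
    """Relaxed EMD between two feature sets (hypothetical helper).

    With one marginal constraint removed, the optimal plan sends each
    feature to its nearest neighbour in the other set. The two one-sided
    costs are both lower bounds on the true EMD, so the larger is used.
    """
    cost = cosine_distance_matrix(a, b)
    a_to_b = cost.min(axis=1).mean()  # every row of a finds its closest row of b
    b_to_a = cost.min(axis=0).mean()  # every row of b finds its closest row of a
    return max(a_to_b, b_to_a)
```

Taking the maximum matters when one set covers only part of the other: the one-sided cost from the smaller set can be near zero even though many features in the larger set have no good match.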

Algorithm Details

STROTSS employs a gradient descent variant, RMSprop, to minimize a proposed objective function. The loss function comprises:

Content Loss: Inspired by local self-similarity. It compares the pairwise feature distances within the output to those within the content image, so the structure of the feature space is preserved without matching features directly.
Style Loss: Derived from the Earth Mover's Distance, supplemented with moment matching and color matching losses to handle saturation and palette issues.
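The two less familiar terms above can be sketched in a few lines of NumPy. The content term compares normalised pairwise cosine-distance matrices, and the moment-matching term compares feature means and covariances. The function names, the normalisation axis, and the `eps` stabiliser are assumptions of this sketch, not a transcription of the authors' code.

```python
import numpy as np

def self_similarity(feats, eps=1e-8):
    """Pairwise cosine-distance matrix of (n, d) features, columns normalised to sum to 1."""
    unit = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    dist = 1.0 - unit @ unit.T
    return dist / (dist.sum(axis=0, keepdims=True) + eps)

def content_loss(out_feats, content_feats):
    """Mean absolute difference between the two self-similarity matrices.

    Both inputs must hold features from the same n spatial locations.
    """
    return np.abs(self_similarity(out_feats) - self_similarity(content_feats)).mean()

def moment_matching_loss(out_feats, style_feats):
    """L1 gap between feature means plus L1 gap between feature covariances."""
    d = out_feats.shape[1]
    mean_gap = np.abs(out_feats.mean(axis=0) - style_feats.mean(axis=0)).sum() / d
    cov_gap = np.abs(
        np.cov(out_feats, rowvar=False) - np.cov(style_feats, rowvar=False)
    ).sum() / d**2
    return mean_gap + cov_gap
```

Because the content term only looks at distances between locations, rotating or rescaling all output features leaves it at (numerically) zero. This is the sense in which content is preserved without tying the output to the content image's actual feature or pixel values.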

Additionally, STROTSS allows user-directed control over the transfer process through point-to-point or region-to-region guidance, enhancing its utility as an artistic tool.
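One plausible way to express region-to-region guidance is to edit the transport cost matrix before nearest-neighbour matching: output locations inside a user-marked region are forbidden from matching style locations outside the paired style region, and their allowed matches are re-weighted. The sketch below illustrates that idea only; the function name, the boolean-mask interface, and the default weight `beta` are hypothetical.

```python
import numpy as np

def guided_cost(cost, out_region, style_region, beta=5.0):
    """Return a copy of an (n, m) cost matrix with one region pair enforced.

    out_region   : boolean mask of length n marking guided output locations
    style_region : boolean mask of length m marking the paired style locations
    beta         : weight applied to costs inside the pair (assumed value)
    """
    out_region = np.asarray(out_region, dtype=bool)
    style_region = np.asarray(style_region, dtype=bool)
    guided = cost.astype(float)  # astype returns a copy, so the input is untouched
    # Guided output locations may not match style locations outside the pair.
    guided[np.ix_(out_region, ~style_region)] = np.inf
    # Allowed matches inside the pair are re-weighted.
    guided[np.ix_(out_region, style_region)] *= beta
    return guided

cost = np.array([[0.1, 0.9, 0.5],
                 [0.2, 0.3, 0.4]])
guided = guided_cost(cost, [True, False], [False, True, True])
matches = guided.argmin(axis=1)  # nearest style location for each output location
```

In this toy example the first output location would otherwise match style location 0, but the guidance redirects it to the cheapest location inside the paired region. Point-to-point guidance can be seen as the special case where each region is a small neighbourhood around a chosen point.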

Evaluation

A large-scale user study was conducted via Amazon Mechanical Turk, with 662 participants evaluating the style-content tradeoff. STROTSS delivered higher-quality stylization while preserving content semantics better than previous methods.

Quantitative Insights

The user study results indicate that, for any desired level of content preservation, STROTSS provides higher-quality stylization than the prior methods it was compared against.

Computational Efficiency

The team also addressed computational concerns. While slower than some counterparts at lower resolutions, STROTSS scales well and remains competitive in processing time at higher resolutions. Part of this efficiency comes from optimizing the coefficients of a Laplacian pyramid of the output image rather than its raw pixels, which speeds up convergence.
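The Laplacian pyramid parameterisation can be illustrated with a small NumPy sketch. The image is stored as a stack of detail layers plus a coarse residual, and the optimiser would update those layers instead of pixels. The 2x2 box downsampling and nearest-neighbour upsampling used here are simplifications chosen for brevity, and all function names are assumptions of this sketch.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (height and width must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_residual]."""
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # detail lost by downsampling
        current = smaller
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert build_pyramid: upsample the coarse level and add details back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current
```

A single coefficient in the coarsest layer influences a whole block of output pixels, so a gradient step on the pyramid can change large-scale structure and fine detail at the same time. That is one intuition for why this parameterisation can converge faster than optimising pixels directly.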

Theoretical and Practical Implications

The introduction of self-similarity in defining content marks a significant theoretical advance, potentially influencing pattern recognition systems beyond style transfer. Furthermore, the practical applicability of STROTSS, with its intuitive user-control capabilities, opens avenues in digital art and media production.

Future Directions

Future research could explore more sophisticated EMD approximations to further refine the stylistic fidelity of the transfer. Another potential path is to train feed-forward networks with the STROTSS objective to accelerate style transfer, narrowing the gap between quality and real-time performance.

In summary, STROTSS introduces a well-founded, effective approach to style transfer, offering both theoretical and practical benefits. Its rigorous evaluation and demonstrated superior performance underscore its potential impact on the field of computer vision.

GitHub

GitHub - nkolkin13/STROTSS: Style Transfer by Relaxed Optimal Transport and Self-Similarity (CVPR 2019) (312 stars)
