- The paper introduces an unsupervised conditional GAN method that generates multiple diverse and plausible colorizations for grayscale images.
- It utilizes a novel generator architecture with convolutional layers and multi-layer noise/condition concatenation to preserve spatial information and enhance diversity.
- In a human evaluation, generated color images were judged realistic 62.6% of the time, versus 70.0% for real images, a difference that was not statistically significant.
Unsupervised Diverse Colorization via Generative Adversarial Networks
The paper "Unsupervised Diverse Colorization via Generative Adversarial Networks," by Yun Cao, Zhiming Zhou, Weinan Zhang, and Yong Yu, presents an approach to colorizing grayscale images with Generative Adversarial Networks (GANs). The authors address a key limitation of deterministic colorization methods, namely that each grayscale input yields a single fixed output, by introducing a model capable of generating multiple plausible colorizations for a given grayscale image without supervision.
Methodology
The proposed method uses a conditional GAN to model the distribution of real-world item colors, offering a versatile solution to the colorization problem. Central to the approach is a novel generator architecture built entirely from convolutional layers with stride one, which preserves spatial resolution at every layer. The generator injects noise at multiple layers to enhance diversity and concatenates the grayscale condition at multiple layers to keep the output anchored to the input. Because the model learns directly from unlabeled color images, it requires no annotated training data.
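The core architectural idea, concatenating both the grayscale condition and fresh noise onto the feature maps at every layer before a stride-1 convolution, can be sketched as follows. This is a toy NumPy illustration, not the paper's implementation: the layer count, channel widths, and the use of 1x1 convolutions are all assumptions chosen to keep the sketch short while preserving the key property (spatial size is never reduced).

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in). A stride-1 1x1 convolution is a
    # per-pixel linear map over channels; spatial size is preserved.
    return np.tensordot(w, x, axes=([1], [0]))  # -> (C_out, H, W)

def generator(gray, n_layers=4, feat=8, rng=None):
    """Toy sketch of the multi-layer concatenation idea: at EVERY layer,
    append (a) the grayscale condition and (b) a fresh noise channel to
    the feature maps before the next stride-1 convolution.
    Sizes and weights here are illustrative, not the paper's settings."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = gray.shape
    x = gray[None]  # (1, H, W) initial features
    for _ in range(n_layers):
        noise = rng.standard_normal((1, h, w))
        x = np.concatenate([x, gray[None], noise], axis=0)
        wt = rng.standard_normal((feat, x.shape[0])) * 0.1
        x = np.tanh(conv1x1(x, wt))  # stride 1 -> H, W unchanged
    # final 1x1 conv to two chroma channels (U, V); Y is the input itself
    w_out = rng.standard_normal((2, x.shape[0])) * 0.1
    return np.tanh(conv1x1(x, w_out))  # (2, H, W)

gray = np.random.default_rng(1).random((16, 16))
uv = generator(gray)
print(uv.shape)  # (2, 16, 16): spatial resolution preserved throughout
```

Sampling the noise channels anew on each call yields a different chroma prediction for the same grayscale input, which is the mechanism behind the diverse colorizations.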
Performance Evaluation
The method is evaluated on the LSUN bedroom dataset, where it produces diverse, plausible colorizations. A noteworthy aspect of the evaluation is a Turing-style test with 80 human subjects: generated color images were judged realistic 62.6% of the time, compared to 70.0% for real images, a difference the authors found not statistically significant under a t-test.
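To see why a roughly 7-point gap can fail to reach significance, consider a quick back-of-the-envelope check. The paper reports a t-test; the sketch below instead uses a standard two-proportion z-test, and the judgment counts are hypothetical, since the summary gives only the percentages, not the number of judgments behind them.

```python
import math

def two_prop_ztest(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with a pooled variance estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF
    pval = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, pval

# Hypothetical: 80 judgments per condition at the reported rates.
z, p = two_prop_ztest(round(0.626 * 80), 80, round(0.700 * 80), 80)
print(f"z = {z:.2f}, p = {p:.3f}")  # p > 0.05: not significant at this sample size
```

With samples of this size, the observed gap is well within what chance variation allows, consistent with the authors' conclusion.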
Comparison and Architecture Choices
Key architectural choices differentiate this work from prior attempts in the field:
- Convolution Structure: The generator eschews the conventional encoder-decoder structure with deconvolution layers in favor of exclusively using convolution layers, thus retaining spatial features crucial for realistic item separation in images.
- Representation: The authors compare RGB and YUV representations, favoring the latter for stable training and more consistent results.
- Noise Incorporation: By concatenating noise channels at multiple layers rather than only at the input, the generator keeps the noise influential throughout the network, preserving diversity across colorizations.
- Conditional Architecture: The generator concatenates grayscale information across all layers, sustaining constant conditional supervision for robust output generation.
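The preference for YUV over RGB is natural for colorization: the Y (luminance) channel is exactly the grayscale input, so the generator only needs to predict the two chroma channels, and the output is guaranteed to match the input's luminance. The summary does not give the exact conversion the authors use; the sketch below assumes the common BT.601 coefficients.

```python
def rgb_to_yuv(r, g, b):
    """BT.601-style RGB -> YUV on values in [0, 1] (assumed coefficients).
    Y alone is the grayscale channel the generator is conditioned on."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Inverse of the conversion above; recovers RGB exactly."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

y, u, v = rgb_to_yuv(0.25, 0.5, 0.75)
r2, g2, b2 = yuv_to_rgb(y, u, v)
print(round(r2, 6), round(g2, 6), round(b2, 6))  # recovers the input RGB
```

Predicting only U and V also shrinks the output space the GAN must model, which plausibly contributes to the more stable training the authors report.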
Implications and Future Directions
These results carry practical implications for realistic image generation and editing. Because the model is unsupervised, it can be extended to new colorization tasks and domains without labeled data. The authors outline future work on conditional constraints to guide colorization, such as specifying item colors or overall tone schemes, which could further broaden the versatility of GANs in practical applications.
Given these advances, the presented approach holds potential for creative applications in digital media and automated image processing, positioning GAN-based diverse colorization as a promising avenue in the ongoing evolution of computer vision technologies.