DoveNet: Deep Image Harmonization via Domain Verification (1911.13239v3)

Published 27 Nov 2019 in cs.CV

Abstract: Image composition is an important operation in image processing, but the inconsistency between foreground and background significantly degrades the quality of composite image. Image harmonization, aiming to make the foreground compatible with the background, is a promising yet challenging task. However, the lack of high-quality publicly available dataset for image harmonization greatly hinders the development of image harmonization techniques. In this work, we contribute an image harmonization dataset iHarmony4 by generating synthesized composite images based on COCO (resp., Adobe5k, Flickr, day2night) dataset, leading to our HCOCO (resp., HAdobe5k, HFlickr, Hday2night) sub-dataset. Moreover, we propose a new deep image harmonization method DoveNet using a novel domain verification discriminator, with the insight that the foreground needs to be translated to the same domain as background. Extensive experiments on our constructed dataset demonstrate the effectiveness of our proposed method. Our dataset and code are available at https://github.com/bcmi/Image_Harmonization_Datasets.

Citations (186)

Summary

  • The paper introduces a novel domain verification approach that reframes image harmonization as a domain transfer problem.
  • It leverages an attention-enhanced U-Net with both global and verification discriminators to ensure consistent blending of foreground and background.
  • Experiments on the iHarmony4 dataset show lower MSE and higher PSNR than existing methods.

DoveNet: Deep Image Harmonization via Domain Verification

The paper "DoveNet: Deep Image Harmonization via Domain Verification" addresses image harmonization: adjusting the foreground of a composite image so that it is visually compatible with its background, a prerequisite for photorealistic compositions. The authors identify a significant limitation in the field, namely the lack of a high-quality, publicly available dataset for training and evaluating harmonization models. They respond by introducing iHarmony4, a composite image dataset, and by proposing DoveNet, a harmonization method built around domain verification.

iHarmony4 Dataset

The authors present iHarmony4, which consists of four sub-datasets: HCOCO, HAdobe5k, HFlickr, and Hday2night. These are synthesized from the COCO, Adobe5k, and day2night datasets, as well as self-collected images from Flickr. Constructing the dataset involves segmenting foregrounds from original images and adjusting them using various color transfer techniques to generate synthesized composite images. Automatic and manual filtering is then applied to ensure the quality of the composites, with the aim of a more diverse and realistic collection.
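The construction described above can be illustrated with a minimal sketch. It assumes a simple per-channel mean and standard deviation match in RGB as the color transfer step, which is a stand-in chosen for brevity and not necessarily one of the techniques the authors used. The function name `make_synthetic_composite` and the random test images are hypothetical.

```python
import numpy as np

def make_synthetic_composite(real, mask, reference):
    """Recolor the masked foreground of `real` toward the colors of `reference`.

    real, reference: float arrays of shape (H, W, 3) with values in [0, 1].
    mask: boolean array of shape (H, W), True on the foreground.
    The background is left untouched, so `real` can serve as the
    harmonized target for the returned composite.
    """
    composite = real.copy()
    fg = real[mask]                    # (N, 3) foreground pixels
    ref = reference.reshape(-1, 3)     # all reference pixels

    # Assumed color transfer: match per-channel mean and std of the reference.
    fg_mean, fg_std = fg.mean(axis=0), fg.std(axis=0) + 1e-6
    ref_mean, ref_std = ref.mean(axis=0), ref.std(axis=0)
    recolored = (fg - fg_mean) / fg_std * ref_std + ref_mean

    composite[mask] = np.clip(recolored, 0.0, 1.0)
    return composite

# Hypothetical toy data: a bright "real" image and a darker reference.
rng = np.random.default_rng(0)
real = rng.random((8, 8, 3))
reference = rng.random((8, 8, 3)) * 0.5
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True

composite = make_synthetic_composite(real, mask, reference)
```

Because only the foreground is altered, each synthesized composite comes paired with its original image, which is what makes supervised training and pixel-wise evaluation possible.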

DoveNet Architecture

DoveNet introduces a domain verification discriminator, drawing inspiration from adversarial learning. The key innovation is treating harmonization as a domain transfer problem, where the goal is to translate the foreground to the same domain as the background. This is achieved without labeling the domains explicitly: the discriminator only assesses whether the foreground and background appear consistent with each other. The generator is an attention-enhanced U-Net, trained against both a global discriminator and the verification discriminator to produce harmonious images.
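To make the idea of verifying foreground and background consistency concrete, here is a simplified sketch. It assumes the discriminator reduces each region to a feature vector and scores their agreement with an inner product, trained with a hinge loss. The masked average pooling used here is a hypothetical stand-in for a learned encoder, and the function names and toy feature maps are illustrative only; the summary does not specify these details.

```python
import numpy as np

def masked_mean(features, mask):
    """Average a (H, W, C) feature map over the pixels where mask is True."""
    return features[mask].mean(axis=0)

def verification_score(features, fg_mask):
    """Inner product of foreground and background region descriptors.

    A high score means the two regions look like they share a domain.
    """
    fg_vec = masked_mean(features, fg_mask)
    bg_vec = masked_mean(features, ~fg_mask)
    return float(fg_vec @ bg_vec)

def hinge_discriminator_loss(score_real, score_fake):
    """Assumed hinge loss: push real scores above +1 and fake scores below -1."""
    return max(0.0, 1.0 - score_real) + max(0.0, 1.0 + score_fake)

# Toy feature maps: the real image is consistent everywhere,
# the composite has a foreground with opposite-signed features.
fg_mask = np.zeros((4, 4), dtype=bool)
fg_mask[1:3, 1:3] = True

real_features = np.ones((4, 4, 2))
composite_features = real_features.copy()
composite_features[fg_mask] = -1.0

score_real = verification_score(real_features, fg_mask)
score_fake = verification_score(composite_features, fg_mask)
loss = hinge_discriminator_loss(score_real, score_fake)
```

No domain labels appear anywhere in this sketch: the score depends only on how the two regions of the same image relate to each other, which mirrors the label-free framing described above.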

Experimental Results

Extensive experiments validate the effectiveness of DoveNet on the iHarmony4 dataset. The paper compares against both traditional and state-of-the-art deep learning harmonization methods, and DoveNet outperforms them in terms of Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR). The authors also present ablation studies to isolate the contribution of each component, showing that incorporating the domain verification discriminator significantly improves harmonization quality.
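For reference, the two metrics can be computed as in the sketch below. It uses the standard definitions of MSE and PSNR and assumes 8-bit images with a peak value of 255; the exact evaluation protocol of the paper (image resolution, color space) is not stated in this summary, and the toy arrays are hypothetical.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all pixels and channels."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(pred, target)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

# Toy example: a harmonized output that is off by 10 intensity levels everywhere.
target = np.full((4, 4, 3), 100, dtype=np.uint8)
pred = np.full((4, 4, 3), 110, dtype=np.uint8)

mse_value = mse(pred, target)
psnr_value = psnr(pred, target)
```

Lower MSE and higher PSNR both indicate that the harmonized output is closer to the ground-truth image.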

Implications and Future Directions

The release of the iHarmony4 dataset and the development of DoveNet have notable implications for the field of image processing and computer vision. By bridging the gap in available training resources, the authors enable further advancements in learning-based harmonization techniques. Moreover, the introduction of domain verification strategies may stimulate the exploration of similar approaches in other areas of image synthesis and transformation.

Future research could focus on optimizing domain verification methods and exploring unsupervised or semi-supervised harmonization techniques. Applying DoveNet's principles to broader image-to-image translation tasks could also help generate consistent visual content in other applications. Integration with video processing and improved real-time performance are further areas for exploration.