- The paper presents a comprehensive survey of deep image composition, addressing the appearance, geometric, and semantic inconsistencies that make composite images look unrealistic.
- It reviews the key sub-tasks of object placement, image blending, image harmonization, and shadow generation, covering both rule-based and deep learning methods.
- The study highlights the growing role of generative models and foreground object search, pointing toward more unified and standardized AI-driven image editing.
A Comprehensive Survey on Deep Image Composition
The paper "Making Images Real Again: A Comprehensive Survey on Deep Image Composition" by Li Niu et al. provides an exhaustive overview of deep image composition, a crucial technique in image editing that combines a foreground from one image with a different background to create a composite image. The main challenge addressed is the realism of composite images, which often suffer from inconsistencies between foreground and background. These inconsistencies can be of various types, such as appearance, geometry, and semantic discrepancies. In this survey, the authors detail the various sub-tasks involved in image composition, namely object placement, image blending, image harmonization, shadow generation, and methods for combinatorial tasks such as generative image composition and foreground object search.
Inconsistencies in Image Composition
The paper first classifies the inconsistencies that hinder the realism of composite images:
- Appearance Inconsistency: Includes abrupt boundaries, incompatible illumination, missing shadows or reflections, and resolution discrepancies. Techniques such as image blending and harmonization are aimed at addressing these inconsistencies.
- Geometric Inconsistency: Arises from implausible scale, location, or perspective of the foreground object relative to the background. Object placement techniques attempt to rectify these issues, often relying on spatial transformations or predictive models (a minimal placement sketch follows this list).
- Semantic Inconsistency: Arises when the composite image depicts objects in unreasonable contexts or interactions. The authors discuss how advanced sub-task methods are beginning to address these issues, albeit indirectly.
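To make the geometric issue concrete, the sketch below applies a hand-picked scale and translation to a foreground before pasting it onto the background; learned object-placement models predict such parameters instead of taking them as input. The function, its parameters, and the OpenCV-based pipeline are illustrative assumptions, not the survey's method:

```python
import cv2
import numpy as np

def place_foreground(fg, fg_mask, bg, scale, tx, ty):
    """Apply a simple scale-and-translate placement to a foreground and
    its mask, then paste the warped object onto the background.

    fg: uint8 BGR foreground image; fg_mask: uint8 mask (255 inside object)
    bg: uint8 BGR background image of shape (H, W, 3)
    scale: uniform scale factor; tx, ty: translation in pixels
    """
    H, W = bg.shape[:2]
    # 2x3 affine matrix: uniform scaling followed by translation.
    M = np.float32([[scale, 0, tx],
                    [0, scale, ty]])
    warped_fg = cv2.warpAffine(fg, M, (W, H))
    warped_mask = cv2.warpAffine(fg_mask, M, (W, H))
    alpha = (warped_mask.astype(np.float32) / 255.0)[..., None]
    composite = alpha * warped_fg + (1.0 - alpha) * bg
    return composite.astype(np.uint8)
```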
Sub-Tasks and Methodologies
- Object Placement: This task involves choosing the appropriate scale, location, and transformation for the foreground object. Various methods, including traditional rule-based systems and new deep learning approaches that predict optimal transformation parameters, are discussed.
- Image Blending: Techniques that create a seamless transition between foreground and background are explored, often leveraging multi-scale methods or gradient-domain consistency (a Poisson-blending sketch follows this list).
- Image Harmonization: Methods that adjust the foreground's color and illumination to be compatible with the background are covered extensively, spanning both rendering-based and non-rendering-based approaches (a color-statistics baseline is sketched after the blending example).
- Shadow Generation: This sub-task creates shadows for the inserted foreground that are consistent with the lighting and geometry of the background scene, using both traditional rendering techniques and deep learning models.
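As a concrete example of gradient-domain blending, OpenCV ships a Poisson-blending routine, seamlessClone, that matches gradients inside the pasted region to suppress visible seams. A short usage sketch, with placeholder file names and paste location:

```python
import cv2

# Gradient-domain (Poisson) blending with OpenCV's seamlessClone.
# The foreground patch and its mask must have the same size, and the
# patch placed at `center` must fit inside the background.
fg = cv2.imread("foreground.png")                        # BGR foreground patch
bg = cv2.imread("background.png")                        # BGR background image
mask = cv2.imread("fg_mask.png", cv2.IMREAD_GRAYSCALE)   # 255 inside the object

center = (bg.shape[1] // 2, bg.shape[0] // 2)            # (x, y) paste location
blended = cv2.seamlessClone(fg, bg, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("blended.png", blended)
```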
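For harmonization, a crude non-learning baseline is to match the foreground's per-channel color statistics to the background's; the deep models surveyed go far beyond this, but the sketch below illustrates the kind of adjustment involved (the function name and interface are assumptions):

```python
import numpy as np

def match_color_statistics(comp, mask, eps=1e-6):
    """Rule-of-thumb harmonization baseline: shift the foreground's
    per-channel mean and standard deviation toward the background's.

    comp: float image of shape (H, W, 3) in [0, 1]
    mask: bool array of shape (H, W), True on foreground pixels
    """
    out = comp.copy()
    fg_pix = comp[mask]             # (N_fg, 3) foreground pixels
    bg_pix = comp[~mask]            # (N_bg, 3) background pixels
    fg_mean, fg_std = fg_pix.mean(0), fg_pix.std(0) + eps
    bg_mean, bg_std = bg_pix.mean(0), bg_pix.std(0)
    out[mask] = (fg_pix - fg_mean) / fg_std * bg_std + bg_mean
    return np.clip(out, 0.0, 1.0)
```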
Generative Image Composition
Generative models, particularly recent diffusion models, represent a significant paradigm shift: a single model can handle several sub-tasks at once, such as blending and harmonization, offering a unified route to realistic composites.
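As one concrete instance of this paradigm, exemplar-conditioned diffusion models such as Paint-by-Example take a background, a mask over the target region, and an example foreground, and synthesize the composited result in a single pass. A usage sketch, assuming the PaintByExamplePipeline shipped with recent Hugging Face diffusers releases and placeholder file names:

```python
import torch
from PIL import Image
from diffusers import PaintByExamplePipeline

# Exemplar-guided generative composition: the diffusion model re-renders
# the masked region of the background so that it contains the example
# object, handling blending and harmonization implicitly.
pipe = PaintByExamplePipeline.from_pretrained(
    "Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16
).to("cuda")

background = Image.open("background.png").resize((512, 512))
mask = Image.open("placement_mask.png").resize((512, 512))     # white = region to fill
example = Image.open("foreground_object.png").resize((512, 512))

result = pipe(image=background, mask_image=mask, example_image=example).images[0]
result.save("generative_composite.png")
```

Because the model regenerates the masked region, blending, harmonization, and some geometric adaptation happen implicitly, at the cost of exact fidelity to the original foreground pixels.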
Foreground Object Search
Furthermore, the paper examines methods that search and retrieve compatible foreground objects from a large library, emphasizing compatibility along dimensions such as geometry and semantics. Retrieving a well-matched foreground up front can substantially reduce the effort needed to produce a convincing composite.
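A common retrieval recipe is to embed the query (the background and its target region) and every candidate foreground into a shared feature space, then rank candidates by similarity. The sketch below uses an off-the-shelf CLIP image encoder from Hugging Face transformers purely as a stand-in; dedicated foreground-object-search models learn composition-specific compatibility instead, and all file names are placeholders:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Rank candidate foregrounds by cosine similarity of image embeddings
# to the query background crop. CLIP is only a generic stand-in here.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed(images):
    """Return L2-normalized CLIP image embeddings for a list of PIL images."""
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

query = Image.open("background_crop.png")                     # region where the object will go
candidates = [Image.open(f"fg_{i}.png") for i in range(5)]    # placeholder foreground library

q = embed([query])                   # (1, D)
c = embed(candidates)                # (5, D)
scores = (q @ c.T).squeeze(0)        # cosine similarities
ranking = scores.argsort(descending=True).tolist()  # best-matching foregrounds first
```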
Implications and Future Directions
The survey offers insights into the practical applications of image composition, ranging from entertainment to augmented reality and virtual design. With the advent of advanced AI models, future research may enable more sophisticated and automated composition tasks, expanding to domains like video and 3D composition. The introduction of comprehensive tools and datasets, as highlighted by the authors, marks a pivotal step towards standardized benchmarks in this field.
In conclusion, this survey not only synthesizes the current state of image composition research but also establishes a foundation for future advancements, particularly in addressing deeper semantic and contextual inconsistencies in composite images. The implications of this work suggest substantial potential for AI-driven image editing in numerous applications.