
FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process

Published 11 Sep 2024 in cs.CV and cs.MM | (2409.07451v1)

Abstract: The emergence of text-to-image generation models has led to the recognition that image enhancement, performed as post-processing, would significantly improve the visual quality of the generated images. Exploring diffusion models to enhance the generated images is nevertheless not trivial, and necessitates delicately enriching plentiful details while preserving the visual appearance of key content in the original image. In this paper, we propose a novel framework, namely FreeEnhance, for content-consistent image enhancement using off-the-shelf image diffusion models. Technically, FreeEnhance is a two-stage process that first adds random noise to the input image and then capitalizes on a pre-trained image diffusion model (i.e., Latent Diffusion Models) to denoise and enhance the image details. In the noising stage, FreeEnhance is devised to add lighter noise to regions with higher frequency to preserve the high-frequency patterns (e.g., edges, corners) in the original image. In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality. Extensive experiments conducted on the HPDv2 dataset demonstrate that our FreeEnhance outperforms the state-of-the-art image enhancement models in terms of quantitative metrics and human preference. More remarkably, FreeEnhance also shows higher human preference compared to the commercial image enhancement solution of Magnific AI.


Summary

  • The paper introduces FreeEnhance, a novel tuning-free framework leveraging latent diffusion models for image enhancement through a content-consistent noising and denoising process.
  • FreeEnhance employs a two-stage method involving frequency-aware noising and constrained denoising guided by objectives for image acutance, noise distribution, and adversarial regularization.
  • Experiments show FreeEnhance outperforms state-of-the-art models and commercial solutions like Magnific AI in quantitative metrics and human preference on datasets like HPDv2, demonstrating its effectiveness and generalization.

The paper introduces FreeEnhance, a framework for content-consistent image enhancement utilizing pre-trained image diffusion models, specifically addressing the challenge of enriching image details while preserving key content from the original image. FreeEnhance employs a two-stage process involving noising and denoising via Latent Diffusion Models (LDM).

The noising stage adds lighter noise to high-frequency regions, preserving patterns such as edges and corners, while lower-frequency regions receive heavier noise to encourage the synthesis of new details. Concretely, DDIM inversion supplies the light noise applied to high-frequency regions, whereas random noise of higher intensity is injected into low-frequency regions.
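The frequency-aware blending described above can be sketched in a few lines of numpy. This is an illustrative toy, not the authors' implementation: `frequency_map` (a simple Laplacian high-pass) stands in for whatever frequency estimator the paper uses, and plain Gaussian noise stands in for DDIM-inverted noise.

```python
import numpy as np

def frequency_map(img):
    """Rough per-pixel frequency estimate via a Laplacian high-pass filter."""
    lap = np.zeros_like(img)
    lap[1:-1, 1:-1] = np.abs(
        4 * img[1:-1, 1:-1]
        - img[:-2, 1:-1] - img[2:, 1:-1]
        - img[1:-1, :-2] - img[1:-1, 2:]
    )
    # Normalize to [0, 1] so the map can act as a blending weight.
    return lap / (lap.max() + 1e-8)

def content_aware_noise(img, light_sigma=0.1, heavy_sigma=0.5, seed=0):
    """Add lighter noise where frequency is high, heavier noise elsewhere."""
    rng = np.random.default_rng(seed)
    w = frequency_map(img)                        # ~1 in high-frequency regions
    sigma = heavy_sigma * (1 - w) + light_sigma * w
    return img + sigma * rng.standard_normal(img.shape)
```

With a step-edge test image, the weight map peaks along the edge, so the edge receives the light noise and the flat areas the heavy noise.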

In the denoising stage, three target properties act as constraints to regularize predicted noise, enhancing images with high acutance and high visual quality. These constraints are:

  • Image Acutance: Enhances edge contrast, making images appear sharper. The objective function is:

$$\mathcal{L}_{acu}=-\frac{1}{HW}\sum_{i=0,j=0}^{H,W}V\big(\mathcal{F}_{acu}(\hat{x}_{t\rightarrow0})_{(i,j)}\big)\,\mathcal{F}_{acu}(\hat{x}_{t\rightarrow0})_{(i,j)}$$

    where:

    • $\mathcal{L}_{acu}$ is the acutance loss.
    • $H, W$ represent the spatial size of the noisy image.
    • $(i,j)$ are the indices of the spatial element.
    • $\mathcal{F}_{acu}$ is the Sobel operator.
    • $\hat{x}_{t\rightarrow0}$ is the intermediate reconstruction of $x_0$ at timestep $t$.
    • $V(\cdot)$ is a binary indicator function.
  • Noise Distribution: Addresses the generalization error where predicted noise may not follow a Gaussian distribution. The objective is:

$$\mathcal{L}_{dist}=\left\|1 - \mathcal{F}_{var}\big(\epsilon_\theta(x_t; t, y)\big)\right\|_2$$

    where:

    • $\mathcal{L}_{dist}$ is the distribution loss.
    • $\epsilon_\theta(x_t; t, y)$ is the noise predicted by the diffusion model.
    • $\mathcal{F}_{var}$ calculates the variance of the predicted noise.
  • Adversarial Regularization: Prevents blurred images by incorporating a Gaussian blur function:

$$\mathcal{L}_{adv} = \left\|\hat{x}_{t\rightarrow0} - \mathcal{F}_{blur}(\hat{x}_{t\rightarrow0})\right\|_2$$

    where:

    • $\mathcal{L}_{adv}$ is the adversarial loss.
    • $\hat{x}_{t\rightarrow0}$ is the intermediate reconstruction of $x_0$ at timestep $t$.
    • $\mathcal{F}_{blur}$ is a Gaussian blur function.
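The three constraints above can be sketched as plain numpy functions. This is a hedged illustration under simplifying assumptions: images are single-channel arrays, `sobel_mag` stands in for $\mathcal{F}_{acu}$, the indicator $V(\cdot)$ is modeled as a simple threshold, and the blur is passed in as a callable rather than a specific Gaussian kernel.

```python
import numpy as np

def sobel_mag(img):
    """Gradient magnitude via 3x3 Sobel filters (a stand-in for F_acu)."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[1:-1, 1:-1] = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
                      - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy[1:-1, 1:-1] = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
                      - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return np.hypot(gx, gy)

def acutance_loss(x0_hat, thresh=0.1):
    """L_acu: negative mean edge response, with V(.) modeled as a
    threshold indicator selecting sufficiently strong gradients."""
    g = sobel_mag(x0_hat)
    v = (g > thresh).astype(g.dtype)   # binary indicator V(.)
    return -(v * g).mean()

def distribution_loss(eps_pred):
    """L_dist: pull the variance of the predicted noise toward 1."""
    return abs(1.0 - eps_pred.var())

def adversarial_loss(x0_hat, blur):
    """L_adv: l2 distance between the reconstruction and its blurred copy."""
    return np.linalg.norm(x0_hat - blur(x0_hat))
```

A sharp step edge yields a lower (more negative) acutance loss than a flat image, and noise drawn from a standard normal gives a near-zero distribution loss, matching the intended behavior of each term.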

The sampling result $x_{t-1}$ in each denoising operation is then replaced by $x_{t-1}^*$:

$$x_{t-1}^* = x_{t-1} - \rho_{acu}\nabla_{x_t}\mathcal{L}_{acu} - \rho_{dist}\nabla_{x_t}\mathcal{L}_{dist} - \rho_{adv}\nabla_{x_t}\mathcal{L}_{adv}$$

where $\rho_{acu}=4$, $\rho_{dist}=20$, and $\rho_{adv}=0.3$ are the tradeoff parameters determined through experimental studies.
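The guidance update itself is a one-liner once the three loss gradients with respect to $x_t$ are available (in practice they would come from automatic differentiation through the diffusion model; here they are passed in as precomputed arrays, which is purely an illustrative assumption):

```python
import numpy as np

def guided_step(x_prev, grads, rho=(4.0, 20.0, 0.3)):
    """FreeEnhance-style guidance: subtract the weighted loss gradients
    (w.r.t. x_t) from the vanilla denoising sample x_{t-1}."""
    g_acu, g_dist, g_adv = grads
    rho_acu, rho_dist, rho_adv = rho
    return x_prev - rho_acu * g_acu - rho_dist * g_dist - rho_adv * g_adv
```

The default `rho` values mirror the paper's reported tradeoff parameters; in a real sampler this correction would be applied once per denoising timestep.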

Experiments on the HPDv2 dataset demonstrate that FreeEnhance outperforms state-of-the-art image enhancement models in quantitative metrics and human preference, even surpassing the commercial solution of Magnific AI. The Human Preference Score v2 (HPSv2) metric shows FreeEnhance achieves a score of 29.32 without parameter tuning.

Ablation studies validate the contribution of each component of FreeEnhance, including the noising stage and the three denoising constraints. The framework also shows generalization capabilities in text-to-image generation and natural image enhancement. When tested in a text-to-image generation scenario, FreeEnhance achieved the highest HPSv2 score of 25.26 using Stable Diffusion 1.5.
