
AquaDiff: Diffusion Models in Image & Flow Simulation

Updated 22 December 2025
  • AquaDiff is a dual-purpose framework that applies diffusion principles to both underwater image enhancement and multiphase flow simulation.
  • It employs a conditional generative denoising diffusion process with an advanced U-Net architecture to restore natural colors and contrasts in aquatic visuals.
  • In fluid dynamics, AquaDiff uses a diffuse interface (CHNS-type) method with adaptive, energy-stable schemes to accurately model air–sea interfaces.

AquaDiff denotes two distinct but high-impact methodological frameworks in computational science: (i) a class of diffusion-based models for underwater image enhancement that address chromatic and perceptual degradations in aquatic visual data, and (ii) a family of diffuse interface (CHNS-type) algorithms for simulating air–water interfaces in geophysical and environmental fluid dynamics. Both share nomenclature but pertain to different domains—image restoration and multiphase flow modeling—though both leverage diffusion principles at their core.

1. Diffusion-Based Underwater Image Enhancement

Physical Degradation Model

AquaDiff for image enhancement is founded on the Jaffe–McGlamery degradation model, characterizing underwater image formation with per-channel exponential absorption:

I_\lambda(x) = B_\lambda(x)\,e^{-\eta\,d(x)} + J_\lambda(x)\,[1 - e^{-\eta\,d(x)}]

where

  • I_\lambda(x): observed underwater intensity at pixel x and channel \lambda \in \{R, G, B\},
  • B_\lambda(x): scene radiance,
  • J_\lambda(x): background light,
  • \eta: wavelength-dependent attenuation,
  • d(x): scene distance per pixel.

The restoration objective is estimation of \hat B(x) reflecting natural color, contrast, and detail (Shaahid et al., 15 Dec 2025).
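As a minimal illustrative sketch (not from the paper), the degradation model can be simulated per channel; the attenuation coefficients and scene values below are toy assumptions:

```python
import numpy as np

def degrade(B, J, eta, d):
    """Synthesize an underwater observation I from scene radiance B,
    background light J, per-channel attenuation eta, and distance d.
    Shapes: B (H, W, 3), J (3,), eta (3,), d (H, W)."""
    t = np.exp(-eta[None, None, :] * d[..., None])  # per-channel transmission
    return B * t + J[None, None, :] * (1.0 - t)

# Toy example: red attenuates fastest underwater (illustrative values only).
B = np.full((4, 4, 3), 0.8)          # uniform gray scene
J = np.array([0.1, 0.3, 0.5])        # bluish background light
eta = np.array([0.6, 0.2, 0.1])      # R, G, B attenuation
d = np.full((4, 4), 5.0)             # 5 m everywhere
I = degrade(B, J, eta, d)            # red channel is strongly suppressed
```

Restoring \hat B then amounts to inverting this forward map, which is ill-posed because \eta and d(x) are unknown per scene.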

Diffusion Process for Restoration

AquaDiff frames enhancement as a conditional generative denoising diffusion process. The forward process gradually corrupts a clean reference x_0 with noise to generate a Markov chain x_1, \ldots, x_T:

q(x_t \mid x_{t-1}) = \mathcal{N}(x_t;\, \sqrt{1-\beta_t}\,x_{t-1},\, \beta_t I)
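The per-step Gaussian above admits the standard closed form x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\varepsilon, which a training loop samples directly. A minimal sketch (the linear \beta schedule is an assumption; the paper only specifies 2000 steps):

```python
import numpy as np

T = 2000                               # diffusion steps, per Section 5
betas = np.linspace(1e-4, 0.02, T)     # assumed linear schedule
alpha_bar = np.cumprod(1.0 - betas)    # \bar{alpha}_t = prod_s (1 - beta_s)

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form, returning (x_t, noise)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8, 3))
xT, _ = q_sample(x0, T - 1, rng)
# At t = T-1, alpha_bar is essentially zero, so x_T is nearly pure noise.
```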

In reverse, a neural network f_\theta (a U-Net backbone) learns to iteratively denoise x_t, conditioned on a pre-processed input y, through

p_\theta(x_{t-1} \mid x_t, y) = \mathcal{N}(x_{t-1};\, \mu_\theta(x_t, y, t),\, \sigma_t^2 I)

where \mu_\theta is parametrized to predict the mean from f_\theta's output.
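One reverse update can be sketched under the standard DDPM noise-prediction parametrization (an assumption here; the source only states that \mu_\theta is derived from f_\theta's output):

```python
import numpy as np

def p_sample_step(x_t, y, t, eps_theta, betas, alpha_bar, rng):
    """One reverse step x_t -> x_{t-1}. eps_theta(x_t, y, t) stands in for
    the conditional U-Net f_theta (any callable predicting the noise)."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    # Posterior mean mu_theta recovered from the predicted noise:
    mu = (x_t - beta_t / np.sqrt(1.0 - alpha_bar[t]) * eps_theta(x_t, y, t)) \
         / np.sqrt(alpha_t)
    if t == 0:
        return mu                       # final step is deterministic
    return mu + np.sqrt(beta_t) * rng.standard_normal(x_t.shape)

# Usage with a dummy "network" that predicts zero noise:
betas = np.linspace(1e-4, 0.02, 2000)
alpha_bar = np.cumprod(1.0 - betas)
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8, 3))
x_prev = p_sample_step(x, None, 100, lambda x_t, y, t: np.zeros_like(x_t),
                       betas, alpha_bar, rng)
```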

2. Chromatic Prior–Guided Color Compensation

Color distortion in underwater environments is addressed via a physics-guided compensation prior in the CIE Lab color space. The chroma channels a^*, b^* are adaptively de-biased as:

I_a^c(x) = I_a(x) - \kappa\, M(x)\, G[I_a(x)], \qquad I_b^c(x) = I_b(x) - \lambda\, M(x)\, G[I_b(x)]

with M(x) a mask thresholding high-luminance regions and G[\cdot] a Gaussian blur; \kappa, \lambda \approx 0.7. Elementwise compensation is compactly written as:

\mathbf{y} = \mathrm{Lab}^{-1}(L,\ \mathbf{c} - \mathbf{\Lambda} \odot M \odot G[\mathbf{c}])

where \mathbf{\Lambda} = \mathrm{diag}(\kappa, \lambda), targeting compensation where needed and avoiding overcorrection in high-brightness regions (Shaahid et al., 15 Dec 2025).
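A minimal numpy sketch of the compensation, operating on channels already converted to Lab (the luminance threshold and blur width are assumptions, not values from the paper):

```python
import numpy as np

def gaussian_blur(img, sigma=3.0):
    """Separable Gaussian blur G[.] via two 1-D convolutions, edge-padded."""
    r = int(3 * sigma)
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()
    pad = np.pad(img, ((r, r), (r, r)), mode="edge")
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, pad)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, tmp)

def compensate(L_ch, a, b, kappa=0.7, lam=0.7, thresh=80.0):
    """De-bias the chroma channels a*, b* per the compensation prior.
    M masks high-luminance pixels; the threshold value is an assumption."""
    M = (L_ch > thresh).astype(float)
    a_c = a - kappa * M * gaussian_blur(a)
    b_c = b - lam * M * gaussian_blur(b)
    return a_c, b_c

# Toy usage: a uniformly bright patch with a constant chroma cast.
L_ch = np.full((20, 20), 100.0)   # high luminance everywhere -> M = 1
a = np.full((20, 20), 10.0)       # reddish-magenta bias
b = np.full((20, 20), -10.0)      # blue bias
a_c, b_c = compensate(L_ch, a, b) # both casts shrink by factor 1 - 0.7
```

Because the subtraction is gated by M, low-luminance regions keep their original chroma, which is what prevents overcorrection.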

3. Conditional Diffusion Neural Architecture

The denoising backbone underlying AquaDiff is an enhanced U-Net, featuring:

  • Channel multipliers \{1, 2, 4, 8, 16\} on a 64-dim base.
  • Residual dense blocks to optimize feature reuse.
  • Dense skip connections (U-Net++ paradigm) for enriched multi-scale information flow.
  • Multi-resolution self-attention at 16 \times 16 and 32 \times 32 feature resolutions to capture nonlocal chromatic relationships.
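Concretely, the multipliers give the per-level widths below; mapping the attention resolutions to downsampling levels assumes a 256 \times 256 input, which is an inference from the training setup rather than a statement in the source:

```python
# Per-level channel widths of the enhanced U-Net backbone.
base = 64
multipliers = [1, 2, 4, 8, 16]
widths = [base * m for m in multipliers]   # [64, 128, 256, 512, 1024]

# Levels whose feature maps reach 32x32 or 16x16 for a 256x256 input
# (level i halves the resolution i times) carry self-attention.
attn_levels = [i for i in range(len(multipliers)) if 256 // 2**i in (16, 32)]
```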

Key to the model is the cross-attention fusion mechanism, where at each time step, degraded input and current noisy state interact via:

\mathrm{CrossAtt}(x_t, y) = \mathrm{Softmax}\!\left(\frac{Q(x_t)\, K(y)^\top}{\sqrt{d_k}}\right) V(y)

with Q, K, V being trainable projections. This enables the model to dynamically guide denoising using color-compensated cues (Shaahid et al., 15 Dec 2025).
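The fusion formula above is standard scaled dot-product cross-attention; a single-head sketch over flattened token sequences (shapes and weights are illustrative):

```python
import numpy as np

def cross_attention(x_t, y, Wq, Wk, Wv):
    """Single-head cross-attention: queries come from the noisy state x_t,
    keys and values from the conditioning y. x_t: (N, d), y: (M, d)."""
    Q, K, V = x_t @ Wq, y @ Wk, y @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax:
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # each x_t token is a mixture of y values

rng = np.random.default_rng(2)
d = 16
x_t, y = rng.standard_normal((10, d)), rng.standard_normal((12, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = cross_attention(x_t, y, Wq, Wk, Wv)
```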

4. Cross-Domain Consistency Loss

The training objective of AquaDiff is a composite loss, enforcing:

  • Pixel-wise fidelity: \ell_1 norm on (\hat x_0, x_0)
  • Multi-scale structure: sum of pixel losses at 0.5\times and 0.25\times resolutions
  • Perceptual similarity: VGG-19 feature differences, weighted empirically (1:0.5:0.1) across layers 2, 7, 16
  • Structural similarity: 1 - \mathrm{SSIM}(\hat x_0, x_0)
  • Frequency-domain consistency: \ell_1 between FFT magnitudes of output and target

Aggregated as:

\mathcal{L}_\mathrm{CDC} = \frac{1}{HWC}\,\|\hat x_0 - x_0\|_1 + \sum_s \frac{HW}{H_s W_s}\,\|D_s(\hat x_0) - D_s(x_0)\|_1 + \sum_l w_l\, \|\phi_l(\hat x_0) - \phi_l(x_0)\|_2^2 + \left[1 - \mathrm{SSIM}(\hat x_0, x_0)\right] + \frac{1}{HW}\,\big\|\,|\mathcal{F}(\hat x_0)| - |\mathcal{F}(x_0)|\,\big\|_1

This enforces alignment across pixel, perceptual, structural, and spectral domains, suppressing color artifacts and over-smoothing typical of diffusion models (Shaahid et al., 15 Dec 2025).
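The pixel, multi-scale, and frequency terms can be sketched directly in numpy; the perceptual (VGG) and SSIM terms are omitted here since they need a pretrained network and a windowed statistic, and mean-normalization of each term is a simplifying assumption:

```python
import numpy as np

def downsample(x, f):
    """Average-pool by factor f (stands in for the operator D_s)."""
    H, W, C = x.shape
    return x.reshape(H // f, f, W // f, f, C).mean(axis=(1, 3))

def cdc_loss_partial(pred, target):
    """Pixel + multi-scale + frequency terms of the CDC loss (sketch)."""
    pixel = np.abs(pred - target).mean()
    multi = sum(np.abs(downsample(pred, f) - downsample(target, f)).mean()
                for f in (2, 4))                     # 0.5x and 0.25x scales
    freq = np.abs(np.abs(np.fft.fft2(pred, axes=(0, 1)))
                  - np.abs(np.fft.fft2(target, axes=(0, 1)))).mean()
    return pixel + multi + freq

rng = np.random.default_rng(3)
x = rng.standard_normal((16, 16, 3))
loss = cdc_loss_partial(x, x + 0.1 * rng.standard_normal(x.shape))
```

The frequency term compares FFT magnitudes only, so it penalizes missing spectral content without constraining phase, which is what counteracts over-smoothing.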

5. Empirical Evaluation and Benchmarks

AquaDiff is trained with:

  • LSUI (5004 pairs), UIEB (800 train/90 val)
  • Patch size 256 \times 256, 2000 diffusion steps, Adam optimizer with learning rate 3 \times 10^{-6}, batch size 1.

Quantitative performance is reported on full- and no-reference benchmarks (PSNR, SSIM, UIQM, UCIQE) across the U45, S16, C60, and TEST-U90 datasets. Notably, AquaDiff attains the highest UCIQE (colorfulness and contrast) on all no-reference sets and competitive PSNR/SSIM. Ablation demonstrates cumulative gains from the chromatic prior, the enhanced U-Net, and the CDC loss:

| Method | U45 UIQM/UCIQE | S16 UIQM/UCIQE | C60 UIQM/UCIQE | TEST-U90 PSNR/SSIM |
|---|---|---|---|---|
| UDCP | 3.30/0.455 | 1.49/0.443 | 2.73/0.393 | 11.38/0.516 |
| Water-Net | 4.86/0.450 | 3.50/0.431 | 4.45/0.442 | 19.92/0.833 |
| Ucolor | 4.95/0.446 | 3.58/0.419 | 4.33/0.385 | 21.00/0.869 |
| DiffWater | 4.73/0.462 | 4.52/0.450 | 4.66/0.434 | 20.97/0.895 |
| AquaDiff | 4.61/0.539 | 4.44/0.524 | 4.32/0.518 | 20.25/0.883 |

Qualitative analysis confirms strong suppression of blue, green-yellow, and red casts and preservation of texture under varying illumination (Shaahid et al., 15 Dec 2025).

6. Diffuse-Interface Framework in Atmosphere–Ocean Systems

AquaDiff also denotes a diffuse interface methodology for air–sea interactions, modeled by variable-density Cahn–Hilliard–Navier–Stokes (CHNS) systems:

Governing Equations:

  • Momentum (Navier–Stokes):

\rho(\varphi)\,\partial_t v + [\rho(\varphi)v + J] \cdot \nabla v - \nabla\cdot[2\eta(\varphi)\, Dv] + \nabla p = \mu\,\nabla\varphi + \rho(\varphi)\,g

  • Incompressibility:

\nabla \cdot v = 0

  • Phase-field advection–diffusion:

\partial_t \varphi + v \cdot \nabla\varphi = \nabla\cdot[m(\varphi)\,\nabla\mu]

  • Chemical potential:

\mu = -\sigma\epsilon\,\Delta\varphi + \frac{\sigma}{\epsilon}\, F'(\varphi)

Adaptive, energy-stable schemes with a posteriori error estimation and dynamic mesh refinement are implemented to resolve sharp density and velocity gradients at wind-driven interfaces (Garcke et al., 2016).
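To illustrate just the phase-field/chemical-potential coupling, here is a deliberately minimal 1-D explicit Cahn–Hilliard step with no flow (v = 0), constant mobility, a polynomial double-well with F'(\varphi) = \varphi^3 - \varphi, and periodic boundaries. This is a toy sketch, not the implicit, energy-stable, adaptively refined finite-element schemes of Garcke et al.:

```python
import numpy as np

def laplacian_1d(u, dx):
    """Second difference with periodic boundaries."""
    return (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2

def cahn_hilliard_step(phi, dt, dx, sigma=1.0, eps=0.05, m=1.0):
    """One explicit Euler step of d_t phi = div(m grad mu) with
    mu = -sigma*eps*Lap(phi) + (sigma/eps) * (phi^3 - phi)."""
    mu = -sigma * eps * laplacian_1d(phi, dx) + (sigma / eps) * (phi**3 - phi)
    return phi + dt * m * laplacian_1d(mu, dx)

x = np.linspace(0.0, 1.0, 128, endpoint=False)
phi = 0.1 * np.cos(2 * np.pi * x)   # small perturbation of a mixed state
dx, dt = x[1] - x[0], 2e-9          # explicit CH is stiff: dt ~ O(dx^4)
for _ in range(50):
    phi = cahn_hilliard_step(phi, dt, dx)
```

The fourth-order stiffness visible in the tiny time step is precisely why production solvers use implicit, energy-stable discretizations with adaptive meshing near the interface.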

7. Algorithmic, Implementation, and Applications Context

For image enhancement, AquaDiff is implemented as a deep learning pipeline with U-Net backbone, requiring no augmentation beyond cropping/flip. For geophysical flows, the finite-element implementation leverages iFEM mesh management, semi-smooth Newton linearization, Krylov/preconditioned linear solvers, and adaptive mesh refinement. Example simulations include adaptive wind–wave generation resolving up to 25,000 DoFs and capturing dynamic topological changes at air–sea boundaries (Garcke et al., 2016).

In both domains, "AquaDiff" frameworks leverage diffusion processes, either as the generative mechanism for denoising and structural restoration (vision) or as a mesoscopic approximation to multiphase interface physics (fluid dynamics). The significance of both approaches lies in state-of-the-art restoration of underwater imagery and in efficient, accurate modeling of dynamic interfacial flows, respectively.
