
DichroGAN: Seafloor Color Restoration cGAN

Updated 8 January 2026
  • DichroGAN is a conditional generative adversarial network that restores true, dewatered seafloor colors by compensating for depth-dependent spectral distortions.
  • The architecture integrates four specialized generators with a U-Net style encoder–decoder and a ViT-based discriminator to disentangle diffuse, specular, and transmission components.
  • Quantitative evaluations demonstrate superior SSIM and PSNR compared to state-of-the-art methods, highlighting its potential for robust underwater image reconstruction.

DichroGAN is a conditional generative adversarial network (cGAN) designed for the restoration of true in-air seafloor colors from satellite imagery, effectively compensating for the severe, depth-dependent spectral distortions imposed by the water column. The methodology integrates a physically motivated underwater image formation equation with a four-generator architecture, explicitly disentangling diffuse and specular reflectance, transmission, and veiling light, to produce accurate "dewatered" (in-air) radiance estimates. DichroGAN is trained and validated on PRISMA satellite hyperspectral data and demonstrates quantitative and qualitative improvements over state-of-the-art underwater image restoration techniques (Gonzalez-Sabbagh et al., 1 Jan 2026).

1. Physical Modeling: Underwater Image Formation

DichroGAN is grounded in a physically explicit model of underwater radiative transfer, primarily Duntley's underwater image formation model (UIFM). For each spectral band $\lambda_i$, the observed radiance $N$ is:

$$
\begin{aligned}
N(z,\theta,\phi,\lambda_i) &= J(z,\theta,\phi,\lambda_i)\,\exp[-\alpha(z,\lambda_i)\,r] \\
&\quad + V(z,\theta,\phi,\lambda_i)\,\exp[K(z,\theta,\phi,\lambda_i)\,r\cos\theta] \\
&\qquad \times \left\{1-\exp\left[-\alpha(z,\lambda_i)\,r + K(z,\theta,\phi,\lambda_i)\,r\cos\theta\right]\right\}
\end{aligned}
$$

where:

  • $N$: observed radiance at the sensor
  • $J$: true object radiance (just above the seafloor)
  • $V$: veiling light from water-column backscatter
  • $\alpha(z,\lambda) = a(z,\lambda) + b(z,\lambda)$: total attenuation (absorption + scattering)
  • $K$: diffuse attenuation coefficient
  • $r$: range from sensor to seafloor
  • $(\theta, \phi)$: viewing angles

Under a nadir view, this equation simplifies (per-pixel) to:

$$
\mathbf{N}(u,\lambda_i) = \mathbf{J}(u,\lambda_i)\,\mathbf{T}(u,\lambda_i) + \mathbf{V}(u,\lambda_i)\left[1-\mathbf{T}(u,\lambda_i)\right]
$$

with transmission

$$
\mathbf{T}(u,\lambda_i) = \exp[-r\,\alpha(z,\lambda_i)]
$$

Recovery of the "in-air" color requires inverting this mapping:

$$
\mathbf{J}(u,\lambda_i) = \frac{\mathbf{N}(u,\lambda_i)-\mathbf{V}(u,\lambda_i)}{\mathbf{T}(u,\lambda_i)} + \mathbf{V}(u,\lambda_i)
$$

Thus, accurate estimation of $\mathbf{V}(u,\lambda_i)$ and $\mathbf{T}(u,\lambda_i)$ is essential to reconstruct seafloor color.
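Since both the forward model and its inversion are per-pixel algebra once $\mathbf{V}$ and $\mathbf{T}$ are estimated, they reduce to a few array operations. The following is a minimal NumPy sketch of the simplified nadir-view model and its inversion; the array shapes and the epsilon guard against near-zero transmission are illustrative assumptions, not specified in the paper.

```python
import numpy as np

def reproject(J, V, T):
    """Forward UIFM (nadir view): N = J*T + V*(1 - T)."""
    return J * T + V * (1.0 - T)

def dewater(N, V, T, eps=1e-6):
    """Invert the simplified UIFM per pixel: J = (N - V)/T + V.

    N: observed radiance, e.g. shape (H, W, 3)
    V: veiling-light estimate, broadcastable against N
    T: transmission map T = exp(-r * alpha), broadcastable against N
    eps: guard against division by near-zero transmission (assumption)
    """
    return (N - V) / np.maximum(T, eps) + V
```

Note that `dewater(reproject(J, V, T), V, T)` recovers `J` exactly; the UIFM consistency loss in Section 3 exploits the same round trip.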

2. Network Architecture and Component Functions

DichroGAN’s architecture comprises four generators and a single discriminator, organized within a unified conditional GAN:

  • $G_d$ (diffuse-reflectance generator): Estimates the diffuse reflectance component from the RGB satellite input.
  • $G_s$ (specular-reflectance generator): Estimates the specular reflectance component.
  • $G_t$ (transmission/depth generator): Predicts the per-pixel transmission map, representing attenuation due to water depth.
  • $G_j$ (radiance-restoration generator): Synthesizes the final "dewatered" in-air RGB output, combining the preceding estimates.

All generators utilize a U-Net style encoder–decoder structure (a minimal sketch follows this list):

  • Encoder: ResNet-50 pretrained on ImageNet
  • Decoder: Five upsampling blocks (feature map sizes: [256, 128, 64, 32, 16])
  • Skip connections between encoder and decoder layers
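The text leaves block-level details unspecified; the following PyTorch sketch is one plausible realization of this skeleton, with the decoder widths from the list above. The skip wiring, normalization, output activation, and the `in_channels` widening for RGB+mask inputs are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

class UNetGenerator(nn.Module):
    """U-Net style generator: ResNet-50 encoder, five upsampling blocks.

    Decoder widths [256, 128, 64, 32, 16] follow the text; the exact
    skip wiring, normalization, and output activation are assumptions.
    """
    def __init__(self, in_channels=3, out_channels=3):
        super().__init__()
        backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
        if in_channels != 3:  # widen the stem for RGB+mask inputs
            backbone.conv1 = nn.Conv2d(in_channels, 64, 7, stride=2,
                                       padding=3, bias=False)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu)
        self.pool = backbone.maxpool
        self.enc = nn.ModuleList([backbone.layer1, backbone.layer2,
                                  backbone.layer3, backbone.layer4])
        # Decoder input channels = upsampled features + encoder skip.
        self.dec = nn.ModuleList([
            self._block(2048 + 1024, 256),
            self._block(256 + 512, 128),
            self._block(128 + 256, 64),
            self._block(64 + 64, 32),
            self._block(32, 16),
        ])
        self.head = nn.Conv2d(16, out_channels, kernel_size=1)

    @staticmethod
    def _block(cin, cout):
        return nn.Sequential(
            nn.Conv2d(cin, cout, kernel_size=3, padding=1),
            nn.BatchNorm2d(cout),
            nn.ReLU(inplace=True))

    def forward(self, x):
        up = lambda t: F.interpolate(t, scale_factor=2, mode="bilinear",
                                     align_corners=False)
        s0 = self.stem(x)                #   64 ch, H/2
        e1 = self.enc[0](self.pool(s0))  #  256 ch, H/4
        e2 = self.enc[1](e1)             #  512 ch, H/8
        e3 = self.enc[2](e2)             # 1024 ch, H/16
        e4 = self.enc[3](e3)             # 2048 ch, H/32
        d = self.dec[0](torch.cat([up(e4), e3], dim=1))
        d = self.dec[1](torch.cat([up(d), e2], dim=1))
        d = self.dec[2](torch.cat([up(d), e1], dim=1))
        d = self.dec[3](torch.cat([up(d), s0], dim=1))
        d = self.dec[4](up(d))
        return torch.sigmoid(self.head(d))  # outputs in [0, 1], per Sec. 4
```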

The discriminator is a Vision Transformer (ViT)-based patch-level classifier that evaluates (input, output) pairs for adversarial training.

The comprehensive workflow is as follows:

  1. $G_d$ and $G_s$ take the same RGB+mask input; each outputs its reflectance estimate, $\hat{n}_d$ (diffuse) and $\hat{n}_s$ (specular).
  2. $G_t$ outputs the transmission/depth map $\hat{t}$.
  3. $G_j$ receives the sum $\hat{r} = \hat{n}_d + \hat{n}_s$ and the transmission $\hat{t}$ (plus a Grey-World estimate of the veiling light) and outputs the dewatered RGB $\hat{y}$; see the wiring sketch after this list.
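To make this data flow concrete, here is an inference-time wiring sketch built on the `UNetGenerator` above. The 4-channel input layout, the 7-channel input to $G_j$, and the per-channel-mean form of the Grey-World veiling estimate are assumptions.

```python
import torch

def grey_world_veiling(x, mask):
    """Grey-World style veiling estimate: per-channel mean over water pixels.

    x: (B, 3, H, W) RGB in [0, 1]; mask: (B, 1, H, W) binary water mask.
    """
    s = (x * mask).sum(dim=(2, 3), keepdim=True)
    n = mask.sum(dim=(2, 3), keepdim=True).clamp_min(1.0)
    return (s / n).expand_as(x)

@torch.no_grad()
def dewater_forward(G_d, G_s, G_t, G_j, x, mask):
    inp = torch.cat([x, mask], dim=1)       # RGB + water mask (4 channels)
    n_d = G_d(inp)                          # diffuse reflectance estimate
    n_s = G_s(inp)                          # specular reflectance estimate
    t_hat = G_t(inp)                        # 1-channel transmission map
    r_hat = n_d + n_s                       # reconstructed radiance
    v_gw = grey_world_veiling(x, mask)      # veiling-light estimate
    y_hat = G_j(torch.cat([r_hat, t_hat, v_gw], dim=1))  # dewatered RGB
    return y_hat, t_hat
```

Under these assumptions, `G_d` and `G_s` would each be a `UNetGenerator(in_channels=4)`, `G_t` a `UNetGenerator(in_channels=4, out_channels=1)`, and `G_j` a `UNetGenerator(in_channels=7)`.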

3. Loss Functions and Joint Objective

DichroGAN’s training strategy combines adversarial and physically informed losses:

  • Adversarial loss (cGAN):

$$
\mathcal{L}_{cGAN} = \mathbb{E}_{x,y}[\log D(x,y)] + \mathbb{E}_{x}[\log(1-D(x,G_j(\hat{y}_r,\hat{t})))]
$$

  • Dichromatic model-based reflectance decomposition:
    • Diffuse: $\mathcal{L}_{gd} = \|L_{gw}(\lambda)\,g\,S(u,\lambda)-\hat{y}_d\|_1$
    • Specular: $\mathcal{L}_{gs} = \|k(u)\,L_{gw}(\lambda)-\hat{y}_s\|_1$
    • Full radiance reconstruction: $\mathcal{L}_r = \|I(u,\lambda)-(G_d(x)+G_s(x))\|_2$ (water-masked)
  • Transmission/depth regularization:
    • $L_1$ transmission: $\mathcal{L}_{t_1} = \|T(u,\lambda)-\hat{y}_t\|_1$
    • Scale-invariant smoothness: $\mathcal{L}_{t_2} = \frac{1}{n}\sum_i \left(|\nabla_x\log\hat{y}_{t,i}| + |\nabla_y\log\hat{y}_{t,i}|\right)$
  • Radiance-restoration loss $\mathcal{L}_{gj}$: $L_1$ penalty on the in-air RGB estimate, masked to water pixels
  • UIFM consistency ("pseudo-reprojection"):

$$
\mathcal{L}_N = \left\|N(u,\lambda)-\hat{N}(u,\lambda)\right\|_1, \qquad \hat{N}=\hat{y}_j\,\hat{t}+V_{gw}\,(1-\hat{t})
$$

These loss terms are combined as:

$$
\mathcal{L}_{obj} = \min_{G_d,G_s,G_t,G_j}\,\max_{D}\; \mathcal{L}_{cGAN} + \gamma\,(\mathcal{L}_{gd}+\mathcal{L}_{gs}) + \sigma\,\mathcal{L}_r + \iota\,\mathcal{L}_{gj} + \tau\,(\mathcal{L}_{t_1}+\mathcal{L}_{t_2}) + \nu\,\mathcal{L}_N
$$

with $\gamma=30$, $\sigma=90$, $\iota=100$, $\tau=50$, $\nu=10$.

Losses involving physical or radiometric quantities are only applied within water-masked regions to avoid bias from land/cloud pixels.
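Read as code, the generator-side update is a weighted sum of these terms. The sketch below assumes a logits-producing discriminator and precomputed per-term losses; the non-saturating BCE form stands in for the minimax expression above, and only the masking helper and smoothness term are spelled out.

```python
import torch
import torch.nn.functional as F

GAMMA, SIGMA, IOTA, TAU, NU = 30.0, 90.0, 100.0, 50.0, 10.0  # from the paper

def masked_l1(pred, target, mask):
    """L1 restricted to water pixels, as the text prescribes."""
    return (torch.abs(pred - target) * mask).sum() / mask.sum().clamp_min(1.0)

def log_gradient_smoothness(t_hat, eps=1e-6):
    """Scale-invariant smoothness term L_t2 on the transmission map."""
    lt = torch.log(t_hat.clamp_min(eps))
    gx = (lt[..., :, 1:] - lt[..., :, :-1]).abs().mean()
    gy = (lt[..., 1:, :] - lt[..., :-1, :]).abs().mean()
    return gx + gy

def generator_objective(d_fake_logits, l_gd, l_gs, l_r, l_gj, l_t1, l_t2, l_n):
    """Weighted generator-side objective; each l_* is a precomputed term."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))  # non-saturating form
    return (adv + GAMMA * (l_gd + l_gs) + SIGMA * l_r + IOTA * l_gj
            + TAU * (l_t1 + l_t2) + NU * l_n)
```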

4. Dataset Construction and Preprocessing

DichroGAN is trained on data derived from PRISMA Level-2 VNIR hyperspectral cubes, which provide:

  • 63 bands spanning 400–1010 nm at 30 m GSD.
  • RGB synthesis: bands 33 (R), 45 (G), 56 (B).
  • Binary water-vs.-nonwater masks via automatic thresholding on NIR bands.

The training corpus includes:

  • 1,570 unique RGB scenes with all 63 spectral bands (≈98,000 image slices).
  • All images (input/output) normalized to $[0,1]$ and resized to $256\times256$ (see the preprocessing sketch below).
  • Histogram stretch applied to diffuse/specular outputs for improved dynamic range.
  • No explicit geometric or photometric augmentation, apart from random seed specification.
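A sketch of the band-selection, normalization, and water-masking steps described above, assuming the cube is a NumPy array with the 63 VNIR bands along the last axis. The NIR band index and threshold are illustrative assumptions (the paper specifies only automatic NIR thresholding), and whether band numbers are 0- or 1-indexed is not stated.

```python
import numpy as np

R_BAND, G_BAND, B_BAND = 33, 45, 56  # PRISMA VNIR band numbers from the text
NIR_BAND = 60                        # illustrative NIR band (assumption)
NIR_WATER_THRESHOLD = 0.05           # illustrative threshold (assumption)

def cube_to_rgb_and_mask(cube):
    """cube: (H, W, 63) PRISMA Level-2 VNIR reflectance cube.

    Returns an RGB composite normalized to [0, 1] and a binary water mask.
    """
    rgb = cube[..., [R_BAND, G_BAND, B_BAND]].astype(np.float32)
    lo, hi = rgb.min(), rgb.max()
    rgb = (rgb - lo) / max(hi - lo, 1e-8)  # normalize to [0, 1]
    # Water absorbs strongly in the NIR, so dark NIR pixels are water.
    water = (cube[..., NIR_BAND] < NIR_WATER_THRESHOLD).astype(np.uint8)
    return rgb, water
```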

5. Quantitative Results and Comparative Analysis

DichroGAN’s performance is assessed through detailed ablations and benchmarks, summarized as follows:

Model                   SSIM    PSNR (dB)
cGANs-baseline          0.593   17.75
cWGAN-2                 0.524   17.96
cGANs-G_t               0.582   17.78
cGAN-VGG                0.489   16.34
DichroGAN (proposed)    0.672   18.01

On full-reference NASA EO data, DichroGAN achieves the highest SSIM and PSNR scores among tested architectures.

A comparable pattern holds in benchmark comparisons with classical and modern underwater restoration methods (UDCP, CWR, NU²Net, Phaseformer):

  • On NASA EO: DichroGAN achieves SSIM 0.560, PSNR 14.39 dB (highest PSNR).
  • On combined PRISMA + NASA EO (no-reference): CCF 18.84, UIQM 2.342, NIQE 5.422 (2nd on CCF/NIQE across methods).
  • On HICRD & UIEB (underwater benchmarks): best NIQE, competitive CCF/UIQM.

Qualitative analysis shows DichroGAN restores terrain detail and color without over-enhancement or color cast, in contrast to existing methods that often introduce artifacts or fail to remove water silhouettes.

6. Implementation Details and Limitations

  • Framework: PyTorch, running on AMD EPYC 7402P CPU with 60 GB RAM.
  • Hyperparameters: Batch size 6, 130 epochs, learning rate $2 \times 10^{-4}$, Adam optimizer with $\beta_1=0.5$, $\beta_2=0.999$ (see the configuration sketch after this list).
  • Initialization: Generator encoders from ResNet-50 pretrained on ImageNet, with $G_j$ and $G_t$ warm-started from a scene-depth model [González-Sabbagh et al., 2025].
  • Input/output sizes: All networks operate on $3\times256\times256$ images, except $G_t$, whose output is $1\times256\times256$.
  • Masking: All key losses are masked to water pixels.
  • Limitations: Tendency for slight blurriness (loss of high-frequency detail), somewhat lower performance on metrics biased toward over-enhancement. Future work includes scaling dataset size, multi-term perceptual/texture losses, and evolving to higher-resolution architectures.
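The optimizer settings above translate directly into PyTorch. A minimal configuration sketch, assuming the generators are instantiated as in the Section 2 sketch (channel counts there are assumptions) and share one optimizer, with the discriminator optimized separately:

```python
import itertools
import torch

BATCH_SIZE, EPOCHS, LR = 6, 130, 2e-4  # from the text

# Generator instances; in_channels values follow the earlier sketch.
G_d = UNetGenerator(in_channels=4)                  # RGB + mask -> diffuse
G_s = UNetGenerator(in_channels=4)                  # RGB + mask -> specular
G_t = UNetGenerator(in_channels=4, out_channels=1)  # RGB + mask -> transmission
G_j = UNetGenerator(in_channels=7)                  # radiance + t + veiling -> RGB

gen_params = itertools.chain(*(g.parameters() for g in (G_d, G_s, G_t, G_j)))
opt_g = torch.optim.Adam(gen_params, lr=LR, betas=(0.5, 0.999))
# The ViT discriminator (not sketched here) gets the same optimizer settings.
```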

7. Significance, Outlook, and Generalization

DichroGAN enables explicit, physically grounded restoration of in-air radiance from satellite images of the seafloor, integrating the dichromatic reflection model and Duntley’s UIFM into a unified deep generative framework. This architecture achieves state-of-the-art restoration across reference and non-reference benchmarks. A plausible implication is that the explicit disentanglement of reflectance, transmission, and veiling light within a deep learning model is critical to robust underwater image reconstruction, and similar frameworks may be extensible to other remote sensing or atmospheric correction domains (Gonzalez-Sabbagh et al., 1 Jan 2026).
