UD-SfPNet: End-to-End Underwater 3D Reconstruction

Updated 5 March 2026

The paper introduces UD-SfPNet, a unified framework that integrates descattering and shape-from-polarization to achieve state-of-the-art underwater 3D normal reconstruction.
It leverages a Polarization Parameter Network, Descattering Network, and Normal Estimation Network with Detail-Enhanced Convolutions to reduce angular error on the MuS-Polar3D dataset.
Experimental evaluations show enhanced PSNR, SSIM, and LPIPS metrics, demonstrating the practical capability to capture fine geometric details in turbid underwater environments.

UD-SfPNet is an end-to-end neural framework for underwater 3D surface normal reconstruction that jointly models descattering and shape-from-polarization (SfP) inference, leveraging polarization cues acquired via division-of-focal-plane (DoFP) polarimetric imaging. The architecture integrates polarization physics with advanced deep-learning modules to achieve state-of-the-art accuracy in challenging turbid underwater environments. UD-SfPNet was introduced to avoid error accumulation inherent in sequential (cascaded) descattering-then-reconstruction pipelines by enabling global optimization across both tasks, outperforming previous methods in terms of mean surface normal angular error on the MuS-Polar3D dataset (Wang et al., 1 Mar 2026).

1. Motivation and Background

Underwater optical imaging is fundamentally limited by particulate scattering (notably Mie scattering), leading to blur, contrast loss, and noise that traditional intensity- or spectrum-based dehazing methods cannot adequately resolve, particularly when backscatter and target radiance are comparable. Polarization imaging distinguishes itself by exploiting the differential polarization states of backscattered and target-reflected light—captured via a DoFP sensor at 0°, 45°, 90°, and 135°—which enables both descattering and direct extraction of geometric surface orientation cues via the degree and angle of polarization.

Typical cascaded approaches, in which descattering precedes SfP normal estimation, propagate irrecoverable errors from the first stage. By contrast, UD-SfPNet unifies both tasks in a single global optimization framework. Loss functions from both low-level (descattering) and high-level (normal estimation) objectives co-regulate the pipeline, ensuring the preservation of fine geometric information and substantially mitigating error accumulation.

2. Network Architecture

UD-SfPNet comprises three interacting modules: the Polarization Parameter Network (PPN), Descattering Network (DN), and Normal Estimation Network (NEN), with auxiliary components designed for geometric and color consistency.

Polarization Parameter Network (PPN):
- Inputs: Degree of polarization ($\rho$), angle of polarization ($\phi$), and specular ($I^S$) and diffuse ($I^D$) image components, extracted from processed Stokes parameters.
- Outputs: A high-dimensional ‘normal feature’ (NF) and a 64-bin global normal-orientation histogram. The PPN regularizes local predictions with a global normal prior encoded in its output.

Descattering Network (DN):
- Architecture: A U-Net variant (4-level encoding/decoding with skip connections) in which all convolutions are replaced by Detail-Enhanced Convolutions (DEConv). The DN processes raw scattered polarization images $I_{sc}$ and outputs descattered images $I_{desc}$.
- Losses: $L_1$ pixel loss, SSIM structural loss, TV regularization, and perceptual (LPIPS) loss, all masked to the target region.

Normal Estimation Network (NEN):
- Inputs: NF from the PPN and $I_{desc}$ from the DN.
- Architecture: Shared encoder, multi-head attention bottleneck, and two decoder branches: one focused on polarization cues, the other embedding a Pyramid Color Embedding (PCE) module for channel–orientation consistency. All convolutions use DEConv for enhanced high-frequency detail preservation.
- Output: Predicted normal map $N_{pre}$, supervised by a cosine similarity-based angular error loss.

Information Flow and Optimization

The NF acts as a global polarization prior, guiding NEN’s local normal predictions. Descattered images and polarization-derived features are jointly processed. All loss terms are summed into a unified objective; back-propagation updates all sub-networks simultaneously, enforcing cross-stage consistency.

3. Mathematical Modeling

3.1 Underwater Scattering and Descattering

For each Stokes channel $S_0$, underwater image formation is modeled as:

$S_0(x, y) = T(x, y) + B(x, y)$

with $T$ as the unattenuated target signal and $B$ as additive backscatter. The DN learns an implicit inversion of this relationship under supervision.
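The effect of the additive backscatter term can be illustrated with a toy calculation. The sketch below is only an illustration of the formation model $S_0 = T + B$, not part of UD-SfPNet; the signal values, the uniform backscatter level, and the helper `michelson_contrast` are all hypothetical choices made for the example.

```python
def michelson_contrast(values):
    """Michelson contrast (max - min) / (max + min) of a list of intensities."""
    hi, lo = max(values), min(values)
    return (hi - lo) / (hi + lo)

# Hypothetical 1-D target signal T (arbitrary units) and a uniform backscatter B
# of comparable magnitude, the regime the text describes as hardest.
target = [0.2, 0.8, 0.2, 0.8]
backscatter = 0.5

# Additive formation model from the text: S0 = T + B at every pixel.
observed = [t + backscatter for t in target]

contrast_clean = michelson_contrast(target)       # about 0.6
contrast_observed = michelson_contrast(observed)  # about 0.3
print(f"contrast without backscatter: {contrast_clean:.2f}")
print(f"contrast with backscatter:    {contrast_observed:.2f}")
```

Adding a constant to every pixel leaves the difference between bright and dark regions unchanged but raises their sum, so the contrast halves in this example. Recovering $T$ therefore requires an estimate of $B$, which the DN learns implicitly.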

3.2 Polarization and Surface Geometry

Stokes parameters are computed as:

$S_0 = I_{0^\circ} + I_{90^\circ}, \quad S_1 = I_{0^\circ} - I_{90^\circ}, \quad S_2 = I_{45^\circ} - I_{135^\circ}$

Degree and angle of polarization:

$\rho = \frac{\sqrt{S_1^2 + S_2^2}}{S_0}, \quad \phi = \frac{1}{2} \arctan \left(\frac{S_2}{S_1}\right)$

For specular reflection with refractive index $\eta$:

$\rho = \frac{2\sin^2\theta\,\cos\theta\,\sqrt{\eta^2-\sin^2\theta}} {\eta^2-\sin^2\theta - \eta^2\sin^2\theta + 2\sin^4\theta}$

Intensity as a function of polarizer rotation $\varphi$:

$I(\varphi) = \frac{I_{max} + I_{min}}{2} + \frac{I_{max} - I_{min}}{2}\cos \bigl(2\varphi - 2\phi \bigr)$

Given $(\rho, \phi)$, two ambiguous solutions exist for zenith ($\theta$) and azimuth ($\alpha$) angles, yielding the local normal:

$\mathbf{n} = [\sin \theta \cos \alpha, \sin \theta \sin \alpha, \cos \theta]^\mathsf{T}$
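The relations in this subsection can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, `arctan2` is used in place of a plain arctangent so that $S_1 = 0$ is handled, and the small `eps` guard is an added assumption. The sketch does not resolve the two-fold ambiguity in $(\theta, \alpha)$.

```python
import numpy as np

def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes components from the four polarizer-angle images."""
    return i0 + i90, i0 - i90, i45 - i135

def dop_aop(s0, s1, s2, eps=1e-8):
    """Degree (rho) and angle (phi, in radians) of linear polarization."""
    rho = np.sqrt(s1**2 + s2**2) / (s0 + eps)  # eps avoids division by zero
    phi = 0.5 * np.arctan2(s2, s1)             # quadrant-aware arctangent
    return rho, phi

def specular_dop(theta, eta):
    """Specular degree of polarization for zenith angle theta and index eta."""
    sin2 = np.sin(theta) ** 2
    num = 2.0 * sin2 * np.cos(theta) * np.sqrt(eta**2 - sin2)
    den = eta**2 - sin2 - eta**2 * sin2 + 2.0 * sin2**2
    return num / den

def normal_from_angles(theta, alpha):
    """Unit normal from zenith theta and azimuth alpha (last axis = x, y, z)."""
    return np.stack([np.sin(theta) * np.cos(alpha),
                     np.sin(theta) * np.sin(alpha),
                     np.cos(theta)], axis=-1)

# Synthetic check: sample the sinusoid I(varphi) at the four DoFP angles for
# I_max = 0.8, I_min = 0.2 and a polarization angle of 30 degrees.
i_max, i_min, phi_true = 0.8, 0.2, np.deg2rad(30.0)
angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
i0, i45, i90, i135 = ((i_max + i_min) / 2
                      + (i_max - i_min) / 2 * np.cos(2 * angles - 2 * phi_true))

s0, s1, s2 = stokes_from_intensities(i0, i45, i90, i135)
rho, phi = dop_aop(s0, s1, s2)
print(f"rho = {rho:.3f}, phi = {np.rad2deg(phi):.1f} deg")
```

In the synthetic check, the recovered $\rho$ equals $(I_{max} - I_{min}) / (I_{max} + I_{min})$ and the recovered $\phi$ matches the angle used to generate the samples. The same functions accept full image arrays, since every operation is element-wise.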

3.3 Joint Loss Function

The total training loss is

$\mathcal{L}_{total} = \lambda_1 \mathcal{L}_{hist} + \lambda_2 \mathcal{L}_{L1} + \lambda_3 \mathcal{L}_{SSIM} + \lambda_4 \mathcal{L}_{TV} + \lambda_5 \mathcal{L}_{LPIPS} + \lambda_6 \mathcal{L}_{normal}$

with empirical weights $\lambda_1=1.0, \lambda_2=10.0, \lambda_3=1.0, \lambda_4=10.0, \lambda_5=2.0, \lambda_6=30.0$.
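As a minimal sketch of how such a weighted objective is assembled, the snippet below combines six scalar loss terms using the weights reported above. The dictionary keys, the `total_loss` helper, and the example loss values are hypothetical; in a PyTorch training loop the same expression would operate on loss tensors instead of floats.

```python
# Weights lambda_1 ... lambda_6 as reported in the text.
LOSS_WEIGHTS = {
    "hist": 1.0,     # histogram loss
    "l1": 10.0,      # L1 pixel loss
    "ssim": 1.0,     # SSIM structural loss
    "tv": 10.0,      # total-variation regularization
    "lpips": 2.0,    # perceptual loss
    "normal": 30.0,  # angular normal loss
}

def total_loss(terms, weights=LOSS_WEIGHTS):
    """Weighted sum of the individual loss terms."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)

# Made-up loss values, for illustration only.
example_terms = {"hist": 0.5, "l1": 0.02, "ssim": 0.1,
                 "tv": 0.01, "lpips": 0.05, "normal": 0.2}
example_total = total_loss(example_terms)
print(f"total loss: {example_total:.3f}")
```

With these made-up values the normal term contributes 6.0 of the total of about 7.0, which shows how strongly $\lambda_6 = 30$ prioritizes the normal-estimation objective.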

4. Implementation and Ablation

Dataset: MuS-Polar3D (726 samples, 80%/10%/10% train/val/test split).
Infrastructure: PyTorch on 4×NVIDIA A100 GPUs; 1000 training epochs; Adam optimizer, initial LR=0.001.
Augmentation: Random $256\times256$ crops (with $\geq50\%$ foreground), horizontal flipping.
Inference: Sliding-window tiling, overlap stitching.
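The cropping and tiling steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function names, the rejection-sampling strategy for the foreground constraint, the stride of 128, and uniform averaging in overlap regions are all choices made for the example.

```python
import numpy as np

def random_foreground_crop(image, mask, size=256, min_fg=0.5,
                           max_tries=100, rng=None):
    """Sample a size x size crop whose mask covers at least min_fg of it."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = mask.shape
    for _ in range(max_tries):
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        if mask[top:top + size, left:left + size].mean() >= min_fg:
            break  # accepted; otherwise the last sampled window is used
    crop = image[top:top + size, left:left + size]
    crop_mask = mask[top:top + size, left:left + size]
    if rng.random() < 0.5:  # horizontal flip with probability 0.5
        crop, crop_mask = crop[:, ::-1], crop_mask[:, ::-1]
    return crop, crop_mask

def tile_starts(length, tile, stride):
    """Tile offsets covering [0, length); the last tile is flush to the border."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)
    return starts

def sliding_window_predict(image, predict, tile=256, stride=128):
    """Apply predict to overlapping tiles and average where tiles overlap."""
    h, w = image.shape[:2]
    out, count = None, np.zeros((h, w, 1))
    for top in tile_starts(h, tile, stride):
        for left in tile_starts(w, tile, stride):
            patch = predict(image[top:top + tile, left:left + tile])
            if out is None:
                out = np.zeros((h, w, patch.shape[-1]))
            out[top:top + tile, left:left + tile] += patch
            count[top:top + tile, left:left + tile] += 1
    return out / count
```

Here `predict` stands for any callable that maps an `(h, w, C)` tile to an `(h, w, C_out)` array. Two caveats apply when the data are normal maps: a horizontal flip must also negate the x component of each normal, and averaged normals should be renormalized to unit length after stitching. Neither step is included in the sketch.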

Ablation on the MuS-Polar3D test set highlights the importance of each module:

Configuration	Mean MAE (°)	Median MAE (°)
w/o PPN	16.72	15.94
w/o DN	15.37	15.38
w/o PPN & DN	15.56	16.09
w/o Color Embedding	15.46	15.73
w/o DEConv	23.03	22.48
Full UD-SfPNet (proposed)	15.12	15.21

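The DEConv module has by far the largest effect on angular error: removing it raises the mean error from 15.12° to 23.03°, whereas every other ablation stays within about 1.6° of the full model.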

5. Experimental Evaluation

5.1 Quantitative Metrics

Descattering performance: PSNR improves from 30.80 dB (raw) to 36.87 dB, SSIM rises from 0.9569 to 0.9745, and LPIPS (lower is better) drops from 0.3830 to 0.0356.
Surface normal estimation (Mean Angular Error, MuS-Polar3D test set):
- DeepSfP (2020): 19.64°
- SfP-wild (2022): 21.64°
- TransSfP (2023): 20.54°
- AttentionU²-Net (2025): 15.72°
- DSINE (2024): 16.94°
- UD-SfPNet: 15.12° (lowest)
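For reference, the angular error between two normal maps can be computed as sketched below. This is a generic implementation of the metric, not the paper's evaluation script; the function name, the optional foreground mask argument, and the tiny synthetic example are assumptions.

```python
import numpy as np

def angular_error_deg(n_pred, n_gt, mask=None, eps=1e-8):
    """Per-pixel angle in degrees between predicted and reference normals.

    n_pred, n_gt: arrays of shape (..., 3); mask: optional boolean array
    selecting the pixels to evaluate.
    """
    # Normalize both maps so the dot product equals the cosine of the angle.
    n_pred = n_pred / np.maximum(np.linalg.norm(n_pred, axis=-1, keepdims=True), eps)
    n_gt = n_gt / np.maximum(np.linalg.norm(n_gt, axis=-1, keepdims=True), eps)
    cos = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    err = np.degrees(np.arccos(cos))
    return err if mask is None else err[mask]

# Synthetic 2 x 2 example: the reference normals point along +z and the
# predictions are tilted by 0, 10, 20 and 30 degrees in the x-z plane.
tilt = np.deg2rad(np.array([[0.0, 10.0], [20.0, 30.0]]))
n_gt = np.zeros((2, 2, 3))
n_gt[..., 2] = 1.0
n_pred = np.stack([np.sin(tilt), np.zeros_like(tilt), np.cos(tilt)], axis=-1)

errors = angular_error_deg(n_pred, n_gt)
mean_err, median_err = errors.mean(), np.median(errors)
print(f"mean = {mean_err:.2f} deg, median = {median_err:.2f} deg")
```

The clipping step guards against dot products that drift marginally outside [-1, 1] through floating-point rounding, which would otherwise make `arccos` return NaN.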

5.2 Qualitative Results

Error heatmaps reveal that UD-SfPNet distributes errors more evenly, with suppressed errors in high-curvature and fine-detail regions compared to oversmoothing in prior methods. Reconstruction of 3D surfaces via normal integration captures detailed textures and geometry even under varying levels of water turbidity.
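The text does not say which normal-integration method is used. As one common option, the sketch below applies Frankot-Chellappa integration, which recovers a height map from a normal map in the Fourier domain. It assumes orthographic projection, unit pixel spacing, periodic boundaries, and the convention that x runs along columns and y along rows; the function name is hypothetical.

```python
import numpy as np

def integrate_normals_fc(normals, eps=1e-8):
    """Frankot-Chellappa integration: (H, W, 3) normals -> (H, W) height map."""
    nz = normals[..., 2]
    nz = np.where(np.abs(nz) < eps, eps, nz)  # avoid division by zero
    p = -normals[..., 0] / nz                 # dz/dx
    q = -normals[..., 1] / nz                 # dz/dy
    h, w = p.shape
    wx = 2.0 * np.pi * np.fft.fftfreq(w)[None, :]  # radians per pixel, x
    wy = 2.0 * np.pi * np.fft.fftfreq(h)[:, None]  # radians per pixel, y
    denom = wx**2 + wy**2
    denom[0, 0] = 1.0                         # placeholder for the DC term
    z_hat = (-1j * wx * np.fft.fft2(p) - 1j * wy * np.fft.fft2(q)) / denom
    z_hat[0, 0] = 0.0                         # height is known up to an offset
    return np.real(np.fft.ifft2(z_hat))

# Synthetic check on a smooth periodic surface with known gradients.
H, W = 64, 64
y, x = np.mgrid[0:H, 0:W]
kx, ky = 2.0 * np.pi / W, 4.0 * np.pi / H
z_true = 3.0 * np.sin(kx * x) + 2.0 * np.cos(ky * y)
p_true = 3.0 * kx * np.cos(kx * x)
q_true = -2.0 * ky * np.sin(ky * y)
normals = np.stack([-p_true, -q_true, np.ones_like(p_true)], axis=-1)
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

z_rec = integrate_normals_fc(normals)
max_abs_err = np.abs(z_rec - z_true).max()
print(f"max reconstruction error: {max_abs_err:.2e}")
```

The synthetic surface is periodic and has zero mean, so the reconstruction matches it to numerical precision. Real normal maps violate the periodicity assumption at the image border, and implementations commonly pad or mirror the gradients to reduce the resulting artifacts.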

6. Key Insights and Applications

UD-SfPNet demonstrates that end-to-end joint modeling of polarization-based descattering and shape inference achieves superior 3D normal recovery. The improvement is attributed to four factors:

- Polarization uniquely enables the separation of backscatter from the object signal and provides robust surface normal cues.
- The global prior from the PPN regularizes per-pixel predictions.
- Color embedding enforces cross-channel (RGB/orientation) geometric consistency.
- DEConv modules enhance high-frequency detail retention, which is essential under scattering.

Applications extend to underwater robotics (infrastructure inspection, maintenance), marine archaeology, biological imaging (e.g., coral morphology), and environmental monitoring (seabed mapping, coral health)—any scenario demanding high-resolution 3D recovery in turbid water.

UD-SfPNet is the first framework to achieve end-to-end, physically grounded, and geometry-aware underwater 3D imaging via polarization (Wang et al., 1 Mar 2026).

References (1)

UD-SfPNet: An Underwater Descattering Shape-from-Polarization Network for 3D Normal Reconstruction (2026)
