UD-SfPNet: End-to-End Underwater 3D Reconstruction
- The paper introduces UD-SfPNet, a unified framework that integrates descattering and shape-from-polarization to achieve state-of-the-art underwater 3D normal reconstruction.
- It leverages a Polarization Parameter Network, Descattering Network, and Normal Estimation Network with Detail-Enhanced Convolutions to reduce angular error on the MuS-Polar3D dataset.
- Experimental evaluations show enhanced PSNR, SSIM, and LPIPS metrics, demonstrating the practical capability to capture fine geometric details in turbid underwater environments.
UD-SfPNet is an end-to-end neural framework for underwater 3D surface normal reconstruction that jointly models descattering and shape-from-polarization (SfP) inference, leveraging polarization cues acquired via division-of-focal-plane (DoFP) polarimetric imaging. The architecture integrates polarization physics with advanced deep-learning modules to achieve state-of-the-art accuracy in challenging turbid underwater environments. UD-SfPNet was introduced to avoid error accumulation inherent in sequential (cascaded) descattering-then-reconstruction pipelines by enabling global optimization across both tasks, outperforming previous methods in terms of mean surface normal angular error on the MuS-Polar3D dataset (Wang et al., 1 Mar 2026).
1. Motivation and Background
Underwater optical imaging is fundamentally limited by particulate scattering (notably Mie scattering), leading to blur, contrast loss, and noise that traditional intensity- or spectrum-based dehazing methods cannot adequately resolve, particularly when backscatter and target radiance are comparable. Polarization imaging distinguishes itself by exploiting the differential polarization states of backscattered and target-reflected light—captured via a DoFP sensor at 0°, 45°, 90°, and 135°—which enables both descattering and direct extraction of geometric surface orientation cues via the degree and angle of polarization.
Typical cascaded approaches, in which descattering precedes SfP normal estimation, propagate irrecoverable errors from the first stage. By contrast, UD-SfPNet unifies both tasks in a single global optimization framework. Loss functions from both low-level (descattering) and high-level (normal estimation) objectives co-regulate the pipeline, ensuring the preservation of fine geometric information and substantially mitigating error accumulation.
2. Network Architecture
UD-SfPNet comprises three interacting modules: the Polarization Parameter Network (PPN), Descattering Network (DN), and Normal Estimation Network (NEN), with auxiliary components designed for geometric and color consistency.
- Polarization Parameter Network (PPN):
  - Inputs: Degree of polarization, angle of polarization, and specular and diffuse image components, extracted from processed Stokes parameters.
  - Outputs: A high-dimensional ‘normal feature’ (NF) and a 64-bin global normal-orientation histogram. The PPN regularizes local predictions with a global normal prior encoded in its output.
- Descattering Network (DN):
  - Architecture: A U-Net variant (4-level encoding/decoding with skip connections) in which all convolutions are replaced by Detail-Enhanced Convolutions (DEConv). The DN processes raw scattered polarization images and outputs descattered images.
  - Losses: pixel loss, SSIM structural loss, TV regularization, and perceptual (LPIPS) loss, all masked to the target region.
- Normal Estimation Network (NEN):
  - Inputs: NF from the PPN and the descattered images from the DN.
  - Architecture: Shared encoder, multi-head attention bottleneck, and two decoder branches: one focused on polarization cues, the other embedding a Pyramid Color Embedding (PCE) module for channel–orientation consistency. All convolutions use DEConv for enhanced high-frequency detail preservation.
  - Output: Predicted normal map, supervised by a cosine similarity-based angular error loss.
Information Flow and Optimization
The NF acts as a global polarization prior, guiding NEN’s local normal predictions. Descattered images and polarization-derived features are jointly processed. All loss terms are summed into a unified objective; back-propagation updates all sub-networks simultaneously, enforcing cross-stage consistency.
3. Mathematical Modeling
3.1 Underwater Scattering and Descattering
For each Stokes channel, underwater image formation is modeled in terms of two components: the unattenuated target signal and additive backscatter. The DN learns an implicit inversion of this relationship under supervision.
3.2 Polarization and Surface Geometry
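Stokes parameters are computed from the intensities measured at the four polarizer angles, and the degree and angle of polarization are derived from them. For specular reflection, the degree of polarization is related to the zenith angle through the refractive index of the surface. The measured intensity varies sinusoidally with the polarizer rotation angle. Given the degree and angle of polarization, two ambiguous solutions exist for the zenith and azimuth angles, which together define the local surface normal.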
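The relations described in this subsection can be sketched in NumPy as follows. This is a minimal sketch using the standard textbook forms of the linear Stokes parameters, the degree and angle of polarization, and the specular degree of polarization; it is not taken from the UD-SfPNet code. The function names, the default refractive index of 1.5, and the synthetic pixel values are assumptions made for illustration.

```python
import numpy as np

def stokes_from_dofp(i0, i45, i90, i135):
    """Linear Stokes parameters from the four polarizer-angle intensities."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity (average of two estimates)
    s1 = i0 - i90                        # 0 deg vs. 90 deg component
    s2 = i45 - i135                      # 45 deg vs. 135 deg component
    return s0, s1, s2

def dop_aop(s0, s1, s2, eps=1e-8):
    """Degree and angle of linear polarization from Stokes parameters."""
    dop = np.sqrt(s1**2 + s2**2) / np.maximum(s0, eps)
    aop = 0.5 * np.arctan2(s2, s1)       # radians, in (-pi/2, pi/2]
    return dop, aop

def specular_dop(zenith, n=1.5):
    """Standard specular-reflection degree of polarization for zenith angle
    `zenith` (radians) and refractive index `n` (1.5 is an assumed default)."""
    s = np.sin(zenith) ** 2
    num = 2.0 * s * np.cos(zenith) * np.sqrt(n**2 - s)
    den = n**2 - s - n**2 * s + 2.0 * s**2
    return num / den

def normal_from_angles(zenith, azimuth):
    """Unit surface normal from zenith and azimuth angles (radians)."""
    return np.stack([np.sin(zenith) * np.cos(azimuth),
                     np.sin(zenith) * np.sin(azimuth),
                     np.cos(zenith)], axis=-1)

# Synthetic pixel (assumed values): S0 = 1, DoP = 0.4, AoP = 0.3 rad.
s1_true, s2_true = 0.4 * np.cos(0.6), 0.4 * np.sin(0.6)
intensities = [0.5 * (1.0 + s1_true * np.cos(2 * a) + s2_true * np.sin(2 * a))
               for a in np.deg2rad([0.0, 45.0, 90.0, 135.0])]
dop, aop = dop_aop(*stokes_from_dofp(*intensities))
```

Because `specular_dop` is not monotonic in the zenith angle, inverting it for a measured degree of polarization generally gives two candidate zenith angles, and the angle of polarization fixes the azimuth only up to an ambiguity. These are the ambiguities that a learned estimator such as the NEN has to resolve.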
3.3 Joint Loss Function
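The total training loss is a weighted sum of the descattering losses (pixel, SSIM, TV, and LPIPS) and the normal-estimation loss, with empirically chosen weights.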
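To make the structure of such a joint objective concrete, the following NumPy sketch evaluates three of the loss terms named in Section 2 (a masked pixel loss, TV regularization, and a cosine-similarity normal loss) and combines them in a weighted sum. It is an illustration only: the function names and the weight values are hypothetical, the SSIM and LPIPS terms are omitted because they need external implementations, and an actual training setup would compute these terms in an autodiff framework such as PyTorch so that gradients reach every sub-network.

```python
import numpy as np

def masked_l1(pred, target, mask):
    """Mean absolute error over pixels where mask (H, W, bool) is True."""
    return float(np.abs(pred - target)[mask].mean())

def total_variation(img):
    """Anisotropic total variation of an (H, W, C) image."""
    dh = np.abs(img[1:, :] - img[:-1, :]).mean()
    dw = np.abs(img[:, 1:] - img[:, :-1]).mean()
    return float(dh + dw)

def cosine_normal_loss(pred_n, gt_n, mask, eps=1e-8):
    """Mean of (1 - cosine similarity) between predicted and reference normals."""
    pred_u = pred_n / np.maximum(np.linalg.norm(pred_n, axis=-1, keepdims=True), eps)
    gt_u = gt_n / np.maximum(np.linalg.norm(gt_n, axis=-1, keepdims=True), eps)
    cos = np.sum(pred_u * gt_u, axis=-1)
    return float((1.0 - cos)[mask].mean())

def total_loss(terms, weights):
    """Weighted sum of named loss terms."""
    return sum(weights[name] * value for name, value in terms.items())

# Hypothetical weights; the values used in the paper are not reproduced here.
weights = {"pixel": 1.0, "tv": 0.1, "normal": 1.0}

rng = np.random.default_rng(0)
mask = np.ones((16, 16), dtype=bool)
clean, descattered = rng.random((16, 16, 3)), rng.random((16, 16, 3))
gt_n, pred_n = rng.normal(size=(16, 16, 3)), rng.normal(size=(16, 16, 3))

terms = {
    "pixel": masked_l1(descattered, clean, mask),
    "tv": total_variation(descattered),
    "normal": cosine_normal_loss(pred_n, gt_n, mask),
}
loss = total_loss(terms, weights)
```

Keeping the terms in a dictionary makes it easy to log each contribution separately, which helps when checking whether the low-level or high-level objective dominates training.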
4. Implementation and Ablation
- Dataset: MuS-Polar3D (726 samples, 80%/10%/10% train/val/test split).
- Infrastructure: PyTorch on 4×NVIDIA A100 GPUs; 1000 training epochs; Adam optimizer, initial LR=0.001.
- Augmentation: Random crops (with foreground), horizontal flipping.
- Inference: Sliding-window tiling, overlap stitching.
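Sliding-window inference with overlap stitching can be sketched as below. This is a generic illustration rather than the authors' implementation: the tile size, the overlap, uniform averaging in the overlapping regions, and the `predict_fn` callback are all assumptions.

```python
import numpy as np

def tile_starts(size, tile, stride):
    """Start offsets that cover [0, size), with the last tile flush to the edge."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)
    return starts

def sliding_window_predict(image, predict_fn, tile=256, overlap=64):
    """Run predict_fn on overlapping tiles of an (H, W, C) image and average
    the predictions wherever tiles overlap.

    predict_fn is a hypothetical callback mapping an (h, w, C) patch to an
    (h, w, C_out) prediction of the same spatial size.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    h, w = image.shape[:2]
    stride = tile - overlap
    out = None
    weight = np.zeros((h, w, 1), dtype=np.float64)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            pred = predict_fn(image[y:y + tile, x:x + tile])
            if out is None:
                out = np.zeros((h, w, pred.shape[-1]), dtype=np.float64)
            th, tw = pred.shape[:2]
            out[y:y + th, x:x + tw] += pred      # accumulate predictions
            weight[y:y + th, x:x + tw] += 1.0    # count contributions per pixel
    return out / weight

# Usage with a stand-in "network" that just returns the first three channels.
image = np.random.default_rng(0).random((300, 500, 4))
stitched = sliding_window_predict(image, lambda p: p[..., :3], tile=128, overlap=32)
```

When the stitched output is a normal map, the averaged vectors are generally no longer unit length, so a final per-pixel re-normalization is a sensible last step.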
Ablation on the MuS-Polar3D test set highlights the importance of each module:
| Configuration | Mean MAE (°) | Median MAE (°) |
|---|---|---|
| w/o PPN | 16.72 | 15.94 |
| w/o DN | 15.37 | 15.38 |
| w/o PPN & DN | 15.56 | 16.09 |
| w/o Color Embedding | 15.46 | 15.73 |
| w/o DEConv | 23.03 | 22.48 |
| Full UD-SfPNet (proposed) | 15.12 | 15.21 |
Removing DEConv causes by far the largest degradation, raising the mean MAE from 15.12° to 23.03°. Removing the PPN has the next-largest effect (16.72°).
5. Experimental Evaluation
5.1 Quantitative Metrics
- Descattering performance: PSNR rises from 30.80 dB (raw) to 36.87 dB, SSIM rises from 0.9569 to 0.9745, and LPIPS falls from 0.3830 to 0.0356 (lower is better).
- Surface normal estimation (Mean Angular Error, MuS-Polar3D test set):
- DeepSfP (2020): 19.64°
- SfP-wild (2022): 21.64°
- TransSfP (2023): 20.54°
- AttentionU²-Net (2025): 15.72°
- DSINE (2024): 16.94°
- UD-SfPNet: 15.12° (lowest)
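The angular-error metric behind these figures can be computed as in the following sketch. It assumes that normal maps are stored as (H, W, 3) arrays and that a boolean foreground mask selects the object pixels; the function name and the toy data are illustrative, and the exact evaluation protocol of the paper (for example, per-image versus per-pixel averaging) may differ.

```python
import numpy as np

def angular_error_deg(pred_n, gt_n, mask=None, eps=1e-8):
    """Mean and median angular error, in degrees, between two normal maps."""
    pred_u = pred_n / np.maximum(np.linalg.norm(pred_n, axis=-1, keepdims=True), eps)
    gt_u = gt_n / np.maximum(np.linalg.norm(gt_n, axis=-1, keepdims=True), eps)
    # Clip to the valid arccos domain to guard against rounding error.
    cos = np.clip(np.sum(pred_u * gt_u, axis=-1), -1.0, 1.0)
    err = np.degrees(np.arccos(cos))
    if mask is not None:
        err = err[mask]
    return float(err.mean()), float(np.median(err))

# Toy example: half the pixels are exact, half are off by 90 degrees.
gt = np.zeros((2, 2, 3))
gt[..., 2] = 1.0
pred = gt.copy()
pred[0] = [1.0, 0.0, 0.0]
mean_err, median_err = angular_error_deg(pred, gt)
```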
5.2 Qualitative Results
Error heatmaps show that UD-SfPNet's errors are more evenly distributed and lower in high-curvature and fine-detail regions, where prior methods tend to oversmooth. Surfaces reconstructed by normal integration retain detailed texture and geometry across varying levels of water turbidity.
6. Key Insights and Applications
UD-SfPNet demonstrates that end-to-end joint modeling of polarization-based descattering and shape inference achieves superior 3D normal recovery. Four factors contribute:
- Polarization uniquely enables the separation of backscatter from object signal and provides robust surface normal cues.
- Global prior (from PPN) regularizes per-pixel predictions.
- Color embedding enforces cross-channel (RGB/orientation) geometric consistency.
- DEConv modules enhance high-frequency detail retention, essential under scattering.
Applications extend to underwater robotics (infrastructure inspection, maintenance), marine archaeology, biological imaging (e.g., coral morphology), and environmental monitoring (seabed mapping, coral health)—any scenario demanding high-resolution 3D recovery in turbid water.
UD-SfPNet is presented as the first framework to achieve end-to-end, physically grounded, and geometry-aware underwater 3D imaging via polarization (Wang et al., 1 Mar 2026).