Multi-Scale Fusion Convexity Prior (MFCP)
- The paper introduces MFCP as a multi-scale fusion strategy that fuses azimuth information to enhance surface normal estimation in shape-from-polarization.
- MFCP employs multi-scale block decomposition, gamma-correction, and variance-weighted fusion to balance global structure with local detail.
- Experimental results show mean angular error falling from 25.20° to 16.99° on Dataset A and from 20.87° to 13.69° on Dataset B relative to a global convexity prior, with corresponding gains in pixel accuracy.
The Multi-Scale Fusion Convexity Prior (MFCP) is a physically motivated soft constraint on local surface convexity, designed to enhance monocular shape-from-polarization (SfP) by fusing azimuth angle information at multiple spatial scales within locally convex regions. MFCP forms a central component of the Segmentation-Driven Monocular SfP (SMSfP) framework, addressing the intrinsic azimuth ambiguity of polarization analysis and yielding improved surface normal recovery that preserves both global structure and fine-scale texture (Zhang et al., 8 Jan 2026).
1. Mathematical Definition and Formulation
MFCP operates on two per-pixel azimuth angle fields: the polarization azimuth, estimated directly from polarization, and the implicit azimuth, obtained by convex-boundary propagation under a convexity assumption. The procedure comprises multi-scale decomposition, blockwise detail enhancement, and variance-weighted fusion.
- Multi-Scale Block Decomposition: For a discrete set of tile sizes, both azimuth fields are partitioned into non-overlapping blocks at each scale, yielding one set of polarization-azimuth blocks and one set of implicit-azimuth blocks per scale.
- Blockwise Range Mapping and Detail Enhancement: Each block of the polarization azimuth is linearly normalized to [0, 1] and then gamma-corrected. The result is mapped to the dynamic range of the corresponding implicit-azimuth block, ensuring the fused result maintains the coarse statistics of the implicit azimuth while injecting localized details.
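- Variance-weighted Fusion: A sample variance is computed over the object mask at each scale, and the per-scale fusion weights are set from these variances. The fused implicit azimuth is the weighted combination of the per-scale results, so the fusion leverages the most informative spatial scale(s) per region.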
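- Prior Normals: Together with the per-pixel zenith angle, the fused implicit azimuth defines the fused implicit surface normal at each pixel.
- Estimated Normals from Height Field: For the unknown height field, finite differences of the height yield the estimated surface normals.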
- Convexity-Prior Energy Term: The energy uses pixel-wise weights (unity at boundaries, exponentially decaying inward) and penalizes deviations of the estimated normals from the multi-scale convexity prior.
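The decomposition, enhancement, and fusion steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tile sizes (8, 16, 32), the gamma value 0.5, the treatment of flat blocks, the choice to measure variance on the enhanced map, and weights proportional to each scale's variance are all hypothetical, and azimuth wrap-around is ignored.

```python
import numpy as np

def enhance_blocks(phi_pol, phi_imp, tile, gamma=0.5):
    """Per tile: normalize phi_pol to [0, 1], gamma-correct, remap to phi_imp's range."""
    h, w = phi_imp.shape
    out = np.empty((h, w), dtype=float)
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            p = phi_pol[r:r + tile, c:c + tile].astype(float)
            q = phi_imp[r:r + tile, c:c + tile].astype(float)
            span = p.max() - p.min()
            # Assumption: a flat block carries no detail, so map it to mid-range.
            unit = (p - p.min()) / span if span > 0 else np.full_like(p, 0.5)
            out[r:r + tile, c:c + tile] = q.min() + unit ** gamma * (q.max() - q.min())
    return out

def fuse_scales(phi_pol, phi_imp, mask, tiles=(8, 16, 32), gamma=0.5):
    """Fuse the per-scale enhanced maps with variance-derived weights."""
    maps = [enhance_blocks(phi_pol, phi_imp, t, gamma) for t in tiles]
    var = np.array([m[mask].var() for m in maps])
    # Assumed weighting: proportional to variance, normalized to sum to one.
    if var.sum() > 0:
        weights = var / var.sum()
    else:
        weights = np.full(len(tiles), 1.0 / len(tiles))
    fused = sum(wt * m for wt, m in zip(weights, maps))
    return fused, weights
```

Because every enhanced block stays inside the range of its implicit-azimuth block and the weights sum to one, the fused map in this sketch never leaves the overall range of the implicit azimuth.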
2. Integration Within the SfP Energy Minimization
The overall 3D reconstruction is cast as a linear least-squares problem in the height field, with albedo, refractive index, viewing direction, and lighting held fixed. The energy includes:
- Azimuth constraint
- Zenith/intensity constraint
- MFCP (convexity prior)
- Laplacian smoothness
The objective function is a single stacked least-squares problem: each energy term contributes rows to the concatenated system, scaled by that term's weight, and the system is solved by sparse QR.
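To illustrate how a normal prior becomes extra weighted rows in a linear system for the height field, the sketch below stacks prior-normal rows and Laplacian smoothness rows and solves for the height. It is a toy under stated assumptions: the function name, the term weights `lam_prior` and `lam_smooth`, the normal convention n ∝ (-∂z/∂x, -∂z/∂y, 1), and the forward-difference stencil are hypothetical, the azimuth and zenith/intensity terms are omitted, and dense `numpy.linalg.lstsq` stands in for the sparse QR solver named in the text.

```python
import numpy as np

def solve_height(normals, weights, lam_prior=1.0, lam_smooth=0.1):
    """Stack weighted prior-normal rows and Laplacian rows, then solve for z."""
    h, w = weights.shape
    idx = np.arange(h * w).reshape(h, w)
    rows, rhs = [], []

    def add_row(cols, vals, b, scale):
        row = np.zeros(h * w)
        row[cols] = np.asarray(vals, dtype=float) * scale
        rows.append(row)
        rhs.append(b * scale)

    for r in range(h):
        for c in range(w):
            nx, ny, nz = normals[r, c]
            s = np.sqrt(lam_prior * weights[r, c])
            if c + 1 < w:  # nz * dz/dx + nx = 0, forward difference along columns
                add_row([idx[r, c + 1], idx[r, c]], [nz, -nz], -nx, s)
            if r + 1 < h:  # nz * dz/dy + ny = 0, forward difference along rows
                add_row([idx[r + 1, c], idx[r, c]], [nz, -nz], -ny, s)
            if 0 < r < h - 1 and 0 < c < w - 1:  # 5-point Laplacian = 0
                add_row([idx[r, c], idx[r - 1, c], idx[r + 1, c],
                         idx[r, c - 1], idx[r, c + 1]],
                        [4, -1, -1, -1, -1], 0.0, np.sqrt(lam_smooth))

    A, b = np.vstack(rows), np.array(rhs)
    # Height is only defined up to a constant; lstsq returns the minimum-norm solution.
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z.reshape(h, w)
```

Scaling a row by the square root of its weight makes the squared residual of that row carry the full weight, which is why the per-term and per-pixel weights enter as `sqrt` factors.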
3. Rationale for Multi-Scale Fusion
Single-scale global convexity priors derived from object masks fail to resolve local detail and introduce quantization, which undermines fine-structure recovery and produces spatial artifacts. The multi-scale MFCP approach addresses these shortcomings:
- Large tile sizes capture coarse, globally convex azimuth trends.
- Small tiles inject local textural and structural details.
- Variance-based weighting prioritizes scales where the observed azimuth is most informative, enabling automatic adaptation to region complexity.
- Gamma correction accentuates subtle variation in the polarization azimuth prior to fusion, preserving feature contrast, while remapping to the range of the implicit azimuth keeps the result physically plausible.
4. Algorithmic Procedure and Segment Interaction
MFCP operates within each polarization-aided adaptive region growing (PARG) segment:
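- Input: per-pixel polarization azimuth, implicit azimuth, and the set of block sizes.
- For each block size:
  - Tile both azimuth fields.
  - Normalize each polarization-azimuth block, apply gamma correction, and remap to the range of the corresponding implicit-azimuth block.
  - Compute the variance at this scale.
- Compute the fusion weights from the per-scale variances.
- Fuse to obtain the fused implicit azimuth.
- Convert to prior normals using the zenith angle.
- Add the convexity-prior rows to the matrix system with the per-pixel weights.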
PARG segments, obtained via polarization-space region growing, enable imposition of a local convexity assumption by construction. Convexity weights are strongest at segment boundaries and decay inward. Independent reconstruction of each segment, followed by guided-filter stitching of local height fields, preserves both intra-segment consistency and global surface continuity.
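The last two steps of the procedure, converting angles to prior normals and weighting them most strongly at segment boundaries, can be sketched as follows. Everything here is illustrative: the spherical convention (x = sin θ cos φ, y = sin θ sin φ, z = cos θ) is an assumption and SfP papers differ on it, the decay constant `tau` is hypothetical, and depth is measured in 4-connected erosion steps rather than by whatever distance the paper uses.

```python
import numpy as np

def normals_from_angles(azimuth, zenith):
    """Unit normals from azimuth/zenith under an assumed spherical convention."""
    return np.stack([np.sin(zenith) * np.cos(azimuth),
                     np.sin(zenith) * np.sin(azimuth),
                     np.cos(zenith)], axis=-1)

def boundary_weights(mask, tau=3.0):
    """Weight 1 on the segment boundary, decaying as exp(-depth / tau) inward."""
    mask = np.asarray(mask, dtype=bool)
    depth = np.zeros(mask.shape)
    current = mask.copy()
    d = 0
    while current.any():
        padded = np.pad(current, 1, constant_values=False)
        # A pixel is interior if it and its four neighbours are all still inside.
        interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                    padded[1:-1, :-2] & padded[1:-1, 2:] & current)
        depth[current & ~interior] = d  # pixels peeled off at this step
        current = interior
        d += 1
    return np.where(mask, np.exp(-depth / tau), 0.0)
```

Applied to one segment mask at a time, `boundary_weights` gives each segment its own decay profile, which matches the segment-wise use of the prior described above.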
5. Implementation Details
Key parameters and choices reported by Zhang et al. (8 Jan 2026):
- Initial values: initial parameter estimates are obtained per [14].
- MFCP .
- Block sizes: one of two tile-size sets, chosen to balance coverage with detail.
- Finite-difference gradients: central with Gaussian smoothing; forward/backward at edges.
- Segmentation: 5x5 local windows for feature variance, queue-based growing, hole filling, and boundary smoothing.
- Optimization: QR decomposition of the sparse normal equations.
- Iterative update: alternate between solving for the height field and updating the remaining parameters until a convergence criterion is met or 10 iterations are reached.
- Typical runtime: several seconds per image on a modern CPU.
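As a small illustration of the finite-difference item above, `numpy.gradient` uses central differences in the interior and one-sided differences at the array edges, which matches the stencil described. The sketch below turns a height field into unit normals; the function name and the convention n ∝ (-∂z/∂x, -∂z/∂y, 1) are assumptions, and the Gaussian smoothing mentioned in the list is omitted.

```python
import numpy as np

def normals_from_height(z):
    """Unit normals of a height field via central (interior) and one-sided (edge) differences."""
    z = np.asarray(z, dtype=float)
    dz_dy, dz_dx = np.gradient(z)  # axis 0 is rows (y), axis 1 is columns (x)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(z)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)
```

For a planar height field the differences are exact everywhere, so every pixel receives the same normal; this makes a plane a convenient sanity check.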
6. Quantitative and Qualitative Impact
Experimental results demonstrate substantial improvements attributable to MFCP within SMSfP. On Dataset A, mean angular error declines from 25.20° (with a global convexity prior) to 16.99°; on Dataset B, from 20.87° to 13.69°. Pixel accuracy under the reported angular-error threshold increases from 25.33% to 47.56% (A) and from 42.54% to 59.83% (B). Qualitative evaluation shows that MFCP substantially suppresses large coherent error regions, enhances local normal consistency, and recovers fine structural detail, e.g., in textures such as car grilles and animal fur (Zhang et al., 8 Jan 2026).
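For reference, the two metrics quoted above are commonly computed as follows. This is a generic sketch, not the paper's evaluation code: the function names are hypothetical, both normal maps are assumed to be unit length, and `threshold_deg` is left as a parameter because the threshold used in the paper is not stated here.

```python
import numpy as np

def angular_error_deg(n_est, n_gt, mask):
    """Per-pixel angle in degrees between unit normal maps, restricted to the mask."""
    cos = np.sum(n_est * n_gt, axis=-1)[mask]
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def summarize(n_est, n_gt, mask, threshold_deg):
    """Mean angular error and percentage of pixels with error below the threshold."""
    err = angular_error_deg(n_est, n_gt, mask)
    return err.mean(), 100.0 * np.mean(err < threshold_deg)
```

Clipping the dot product guards against values marginally outside [-1, 1] from floating-point rounding, which would otherwise make `arccos` return NaN.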
7. Significance and Contextualization
MFCP generalizes the convexity prior in SfP from a rigid global constraint to an adaptive, multi-scale, and segment-wise regularizer. By fusing information across spatial scales and weighting according to observed variance, it circumvents fundamental limits of polarization ambiguity and coarse mask-based propagation. Coupled with PARG segmentation, MFCP enables a physically principled, locally consistent, and detail-preserving shape reconstruction pipeline, providing empirical gains in both quantitative metrics and visual fidelity compared with existing monocular, physics-based SfP methods (Zhang et al., 8 Jan 2026).