Multi-Scale Fusion Convexity Prior (MFCP)

Updated 15 January 2026
  • The paper introduces MFCP as a multi-scale fusion strategy that fuses azimuth information to enhance surface normal estimation in shape-from-polarization.
  • MFCP employs multi-scale block decomposition, gamma-correction, and variance-weighted fusion to balance global structure with local detail.
  • Experimental results show mean angular error falling from 25.20° to 16.99° on Dataset A and from 20.87° to 13.69° on Dataset B relative to a global convexity prior, with matching gains in pixel accuracy.

The Multi-Scale Fusion Convexity Prior (MFCP) is a physically motivated soft constraint on local surface convexity, designed to enhance monocular shape-from-polarization (SfP) by fusing azimuth angle information at multiple spatial scales within locally convex regions. MFCP forms a central component of the Segmentation-Driven Monocular SfP (SMSfP) framework, addressing the intrinsic azimuth ambiguity of polarization analysis and yielding improved surface normal recovery that preserves both global structure and fine-scale texture (Zhang et al., 8 Jan 2026).

1. Mathematical Definition and Formulation

MFCP operates on two per-pixel azimuth angle fields: $\varphi(x, y)$, estimated directly from polarization, and $\varphi_\mathrm{im}(x, y)$, obtained by convex-boundary propagation under a convexity assumption. The procedure comprises multi-scale decomposition, blockwise detail enhancement, and variance-weighted fusion.

  1. Multi-Scale Block Decomposition: For a discrete set of tile sizes $S = \{A, B, C, \ldots\}$ (e.g., $A=32$, $B=16$, $C=8$), both $\varphi$ and $\varphi_\mathrm{im}$ are partitioned into non-overlapping $i \times i$ blocks at each scale $i \in S$, yielding $\varphi_i$ and $\varphi_{i,\mathrm{im}}$.
  2. Blockwise Range Mapping and Detail Enhancement: Each block of $\varphi_i$ is linearly normalized to $[0,1]$, then $\gamma$-corrected ($\gamma=0.5$):

$$\varphi_{i,\mathrm{gam}}(p) = \left(\mathrm{Norm}(\varphi_i(p))\right)^\gamma$$

The result is then mapped to the dynamic range of the corresponding $\varphi_{i,\mathrm{im}}$ block, ensuring the fused result maintains the coarse statistics of the implicit azimuth while injecting localized details.

  3. Variance-weighted Fusion: The sample variance $\sigma_i^2$ of $\varphi_i$ over the object mask is computed. Weights $\omega_i$ are set as

$$\omega_i = \frac{\sigma_i^2}{\sum_{j \in S} \sigma_j^2}$$

The fused implicit azimuth is:

$$\varphi_\mathrm{im}^\mathrm{out}(x, y) = \sum_{i \in S} \omega_i \cdot \varphi_{i,\mathrm{im}}(x, y)$$

This fusion leverages the most informative spatial scale(s) per region.

  4. Prior Normals: With per-pixel zenith angle $\theta(x,y)$, the fused implicit surface normal is:

$$n_\mathrm{im}(x, y) = \left[ \sin\theta \cos\varphi_\mathrm{im}^\mathrm{out},\ \sin\theta \sin\varphi_\mathrm{im}^\mathrm{out},\ \cos\theta \right]^T$$

  5. Estimated Normals from Height Field: For unknown height $z(x,y)$, finite differences $D_x, D_y$ yield

$$n_\mathrm{est}(x, y) = \left[ -D_x z \cdot \cos\theta,\ -D_y z \cdot \cos\theta,\ \cos\theta \right]^T$$

  6. Convexity-Prior Energy Term: With pixel-wise weights $\omega_\mathrm{con}(x,y)$ (unity at boundaries, exponentially decaying inward),

$$E_\mathrm{convex}(z) = \sum_{x,y} \omega_\mathrm{con}^2(x, y) \left\| n_\mathrm{im}(x, y) - n_\mathrm{est}(x, y) \right\|^2$$

This penalizes normal deviations from the multi-scale convexity prior.
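As a minimal sketch of the last three steps, the NumPy code below builds the prior normals, the height-derived normals, and the weighted energy. It assumes `np.gradient` as a stand-in for $D_x, D_y$ and treats the image row axis as $y$; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def prior_normals(theta, phi_im_out):
    """n_im = [sin(t)cos(p), sin(t)sin(p), cos(t)] per pixel, shape (H, W, 3)."""
    s = np.sin(theta)
    return np.stack([s * np.cos(phi_im_out), s * np.sin(phi_im_out), np.cos(theta)], axis=-1)

def estimated_normals(z, theta):
    """n_est = [-D_x z cos(t), -D_y z cos(t), cos(t)], with np.gradient as D_x, D_y."""
    dz_dy, dz_dx = np.gradient(z)  # axis 0 is y (rows), axis 1 is x (columns)
    c = np.cos(theta)
    return np.stack([-dz_dx * c, -dz_dy * c, c], axis=-1)

def convexity_energy(n_im, n_est, w_con):
    """E_convex = sum over pixels of w_con^2 * ||n_im - n_est||^2."""
    sq_dist = np.sum((n_im - n_est) ** 2, axis=-1)
    return float(np.sum(w_con ** 2 * sq_dist))

# Toy check: a plane z = x has slope 1, so theta = 45 degrees everywhere.
z = np.tile(np.arange(4.0), (3, 1))
theta = np.full(z.shape, np.pi / 4)
w_con = np.ones(z.shape)
n_est = estimated_normals(z, theta)
e_agree = convexity_energy(prior_normals(theta, np.full(z.shape, np.pi)), n_est, w_con)
e_flipped = convexity_energy(prior_normals(theta, np.zeros(z.shape)), n_est, w_con)
```

In this toy case the prior azimuth $\pi$ agrees with the plane's normal and gives essentially zero energy, while the flipped azimuth $0$ (the polarization ambiguity) is penalized at every pixel.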

2. Integration Within the SfP Energy Minimization

The overall 3D reconstruction is cast as a linear least-squares problem in the height field $z$, with fixed albedo $\alpha$, refractive index $\eta$, view $v$, and lighting $l$. The energy includes:

  • Azimuth constraint
  • Zenith/intensity constraint
  • MFCP (convexity prior)
  • Laplacian smoothness

The objective function is

$$E_\mathrm{total}(z) = \|A D z - b\|^2 + \lambda_\mathrm{conv} E_\mathrm{convex}(z) + \lambda_\mathrm{lap} E_\mathrm{lap}(z)$$

where $A D z \approx b$ is the concatenated system, the $E_\mathrm{convex}$ terms are added as extra rows weighted by $\omega_\mathrm{con}$, and the system is solved by sparse QR.
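The row-stacking idea can be illustrated on a 1D toy problem. The sketch below is not the paper's solver: it uses a dense `np.linalg.lstsq` in place of sparse QR, takes $A$ as the identity, and adds an anchor row fixing `z[0] = 0` to remove the constant height offset. The slope data, the $\lambda$ values, and the weight profile are all made up for illustration.

```python
import numpy as np

def forward_diff(n):
    """(n-1) x n forward-difference operator, a 1D stand-in for D."""
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def solve_stacked(terms):
    """Minimise sum_k lam_k * ||M_k z - r_k||^2 by stacking sqrt(lam_k)-scaled rows."""
    M = np.vstack([np.sqrt(lam) * Mk for lam, Mk, _ in terms])
    r = np.concatenate([np.sqrt(lam) * rk for lam, _, rk in terms])
    z, *_ = np.linalg.lstsq(M, r, rcond=None)
    return z

n = 6
D = forward_diff(n)
g_data = np.ones(n - 1)    # toy slopes from the data term
g_prior = np.ones(n - 1)   # toy slopes implied by the convexity prior
k = np.arange(n - 1)
w_con = np.exp(-np.minimum(k, k[::-1]))  # 1 at both ends, decaying inward
L = D[1:] - D[:-1]                       # second-difference (Laplacian) rows
anchor = np.eye(1, n)                    # pins z[0] to 0

terms = [
    (1.0, D, g_data),                             # data rows
    (0.5, w_con[:, None] * D, w_con * g_prior),   # prior rows weighted by w_con
    (0.1, L, np.zeros(n - 2)),                    # Laplacian smoothness rows
    (1.0, anchor, np.zeros(1)),                   # gauge-fixing row
]
z = solve_stacked(terms)
```

Scaling each prior row by $\omega_\mathrm{con}$ reproduces the $\omega_\mathrm{con}^2$ factor in the squared residual, which is why the weights enter the rows unsquared.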

3. Rationale for Multi-Scale Fusion

Single-scale global convexity priors derived from object masks fail to resolve local detail and introduce quantization, which undermines fine-structure recovery and produces spatial artifacts. The multi-scale MFCP approach addresses these shortcomings:

  • Large tile sizes capture coarse, globally convex azimuth trends.
  • Small tiles inject local textural and structural details.
  • Variance-based weighting prioritizes scales where the observed azimuth is most informative, enabling automatic adaptation to region complexity.
  • $\gamma$-correction accentuates subtle variation in $\varphi$ prior to fusion, preserving feature contrast while remapping to the physical range of $\varphi_\mathrm{im}$ for plausibility.
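A short worked example, using illustrative values not drawn from the paper, shows how $\gamma = 0.5$ acts on normalized values. Two nearby values at the low end of a block's range are pulled apart:

$$0.01^{0.5} = 0.1, \qquad 0.04^{0.5} = 0.2, \qquad \text{gap: } 0.03 \rightarrow 0.10$$

while the same gap near the top of the range is compressed:

$$0.81^{0.5} = 0.9, \qquad 0.84^{0.5} \approx 0.917, \qquad \text{gap: } 0.03 \rightarrow 0.017$$

So the accentuation applies mainly to variation close to each block's minimum, which is where per-block normalization places the subtlest structure.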

4. Algorithmic Procedure and Segment Interaction

MFCP operates within each polarization-aided adaptive region growing (PARG) segment:

  1. Input: per-pixel $\varphi$, implicit $\varphi_\mathrm{im}$, block sizes $S$.
  2. For each $i \in S$:
    • Tile $\varphi \rightarrow \varphi_i$, $\varphi_\mathrm{im} \rightarrow \varphi_{i,\mathrm{im}}$.
    • Normalize and apply $\gamma$-correction, remap to the range of $\varphi_{i,\mathrm{im}}$.
    • Compute variance $\sigma_i^2$.
  3. Compute weights $\omega_i$.
  4. Fuse to obtain $\varphi_\mathrm{im}^\mathrm{out}$.
  5. Convert to normals $n_\mathrm{im}$.
  6. Add $E_\mathrm{convex}$ rows to the matrix system with per-pixel weights $\omega_\mathrm{con}$.
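Steps 1 to 4 can be sketched in NumPy as follows. Several details are assumptions rather than facts from the source: the detail-enhanced, range-remapped field at each scale is taken to be what enters the fusion, the variance is computed on that per-scale field so that the weights differ across scales, and angle wrap-around is ignored. The names `enhance_scale` and `mfcp_fuse` are hypothetical.

```python
import numpy as np

def enhance_scale(phi, phi_im, tile, gamma=0.5, eps=1e-12):
    """Per tile: normalise phi to [0, 1], gamma-correct, remap to phi_im's tile range."""
    out = np.empty_like(phi_im, dtype=float)
    H, W = phi.shape
    for r in range(0, H, tile):
        for c in range(0, W, tile):
            blk = phi[r:r + tile, c:c + tile]
            ref = phi_im[r:r + tile, c:c + tile]
            norm = (blk - blk.min()) / (blk.max() - blk.min() + eps)
            out[r:r + tile, c:c + tile] = ref.min() + norm ** gamma * (ref.max() - ref.min())
    return out

def mfcp_fuse(phi, phi_im, scales=(32, 16, 8), gamma=0.5, mask=None):
    """Variance-weighted fusion of the per-scale fields; returns (fused, weights)."""
    if mask is None:
        mask = np.ones(phi.shape, dtype=bool)
    fields = [enhance_scale(phi, phi_im, s, gamma) for s in scales]
    # Assumption: sigma_i^2 is the sample variance of the scale-i field over the mask.
    var = np.array([f[mask].var(ddof=1) for f in fields])
    total = var.sum()
    w = var / total if total > 0 else np.full(len(scales), 1.0 / len(scales))
    fused = sum(wi * f for wi, f in zip(w, fields))
    return fused, w

# Toy inputs: random "measured" azimuth and a radial "implicit" azimuth.
rng = np.random.default_rng(0)
phi = rng.uniform(0.0, np.pi, (32, 32))
yy, xx = np.mgrid[0:32, 0:32]
phi_im = np.arctan2(yy - 15.5, xx - 15.5)
fused, w = mfcp_fuse(phi, phi_im, scales=(16, 8, 4))
```

Because each tile is remapped into the range of its implicit-azimuth block and the weights form a convex combination, the fused field stays within the overall range of `phi_im` under these assumptions.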

PARG segments, obtained via polarization-space region growing, enable imposition of a local convexity assumption by construction. Convexity weights are strongest at segment boundaries and decay inward. Independent reconstruction of each segment, followed by guided-filter stitching of local height fields, preserves both intra-segment consistency and global surface continuity.
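One way to realise weights that equal one on the segment boundary and decay exponentially inward is sketched below. It assumes a 4-neighbourhood step count as the distance measure, treats pixels on the image border as boundary, and uses a hypothetical decay length `tau`; the source does not specify any of these.

```python
import numpy as np

def boundary_distance(mask):
    """4-neighbourhood steps from each mask pixel to the segment boundary (0 on it)."""
    current = mask.astype(bool).copy()
    dist = np.zeros(current.shape, dtype=int)
    d = 0
    while current.any():
        p = np.pad(current, 1, constant_values=False)
        # A pixel is interior if all four neighbours are still in the segment.
        interior = current & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
        dist[current & ~interior] = d  # layer peeled off in this pass
        current = interior
        d += 1
    return dist

def convexity_weights(mask, tau=2.0):
    """w_con = exp(-dist / tau) inside the segment, 0 outside (tau is hypothetical)."""
    mask = mask.astype(bool)
    return np.where(mask, np.exp(-boundary_distance(mask) / tau), 0.0)

mask = np.ones((5, 5), dtype=bool)
dist = boundary_distance(mask)
w_con = convexity_weights(mask, tau=2.0)
```

For the 5 by 5 toy segment the outer ring gets weight 1 and the centre pixel, two steps in, gets $e^{-1}$.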

5. Implementation Details

Key parameters and choices reported by Zhang et al. (8 Jan 2026):

  • Initial values: $\alpha=0.8$, $\eta=1.15$, $v=[0,0,1]$, $l$ estimated per [14].
  • MFCP $\gamma=0.5$.
  • Block sizes: $S = \{32, 16, 8\}$ or $\{16, 8, 4\}$, to balance coverage with detail.
  • Finite-difference gradients: central with Gaussian smoothing; forward/backward at edges.
  • Segmentation: $5 \times 5$ local windows for feature variance, queue-based growing, hole filling, and boundary smoothing.
  • Optimization: QR decomposition of the sparse normal equations.
  • Iterative update: alternate solving for $z$ and updating $\alpha$, $\eta$, and $\theta$ until $\Delta z < 10^{-3}$ or 10 iterations.
  • Typical runtime: several seconds per $10^5$ pixels on a modern CPU.
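The finite-difference scheme in the list above (central in the interior, one-sided at the edges) matches the default behaviour of `np.gradient`, so a minimal sketch can lean on it. The Gaussian smoothing step is omitted here, and the test surface is invented for illustration.

```python
import numpy as np

# Toy height field z(x, y) = x^2 on a 3 x 5 grid (rows are y, columns are x).
z = np.ones((3, 1)) * np.arange(5.0)[None, :] ** 2

# np.gradient returns one array per axis: d/dy (axis 0) first, then d/dx (axis 1).
# Interior points use central differences; the first and last samples use
# forward and backward differences respectively.
dz_dy, dz_dx = np.gradient(z)
```

For the row values 0, 1, 4, 9, 16 this gives slopes 1, 2, 4, 6, 7: the interior entries equal the exact derivative $2x$, and the two edge entries are the one-sided estimates.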

6. Quantitative and Qualitative Impact

Experimental results demonstrate substantial improvements attributable to MFCP within SMSfP. On Dataset A, mean angular error declines from 25.20° (with global convexity prior) to 16.99°; on Dataset B from 20.87° to 13.69°. The fraction of pixels with error below $11.25^\circ$ increases from 25.33% to 47.56% (A) and from 42.54% to 59.83% (B). Qualitative evaluation shows that MFCP substantially suppresses large coherent error regions, enhances local normal consistency, and recovers fine structural detail, e.g., in textures such as car grilles and animal fur (Zhang et al., 8 Jan 2026).
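The two metrics quoted above are conventionally computed from predicted and ground-truth normal maps as in the sketch below. This is a generic formulation, not the paper's evaluation code, and the toy normals are invented for illustration.

```python
import numpy as np

def angular_error_deg(n_pred, n_gt, mask=None):
    """Per-pixel angle in degrees between two normal maps of shape (..., 3)."""
    a = n_pred / np.linalg.norm(n_pred, axis=-1, keepdims=True)
    b = n_gt / np.linalg.norm(n_gt, axis=-1, keepdims=True)
    cos = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)  # guard against round-off
    err = np.degrees(np.arccos(cos))
    return err if mask is None else err[mask]

def summarize(err, thresh=11.25):
    """Return (mean angular error, percent of pixels with error below thresh)."""
    return float(err.mean()), float(100.0 * np.mean(err < thresh))

def tilted(deg):
    """Unit normal tilted by `deg` degrees from the z axis toward +x."""
    t = np.radians(deg)
    return [np.sin(t), 0.0, np.cos(t)]

n_gt = np.array([[0.0, 0.0, 1.0]] * 4)
n_pred = np.array([tilted(0), tilted(0), tilted(10), tilted(30)])
mae, acc = summarize(angular_error_deg(n_pred, n_gt))
```

With errors of 0°, 0°, 10°, and 30°, the toy case gives a mean angular error of 10° and 75% of pixels under the 11.25° threshold.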

7. Significance and Contextualization

MFCP generalizes the convexity prior in SfP from a rigid global constraint to an adaptive, multi-scale, and segment-wise regularizer. By fusing information across spatial scales and weighting according to observed variance, it mitigates the azimuth ambiguity of polarization and the coarseness of mask-based propagation. Coupled with PARG segmentation, MFCP enables a physically principled, locally consistent, and detail-preserving shape reconstruction pipeline, providing empirical gains in both quantitative metrics and visual fidelity compared with existing monocular, physics-based SfP methods (Zhang et al., 8 Jan 2026).
