
Depth Normal Consistency in 3D Reconstruction

Updated 1 December 2025
  • Depth Normal Consistency (DNC) is a technique that measures the angular difference between sensor-derived and predicted normals to filter out unreliable geometric cues.
  • It is applied in adaptive Gaussian splatting pipelines to improve mesh accuracy and photorealism by selectively trusting reliable depth and normal guidance.
  • Empirical results show that DNC raises mesh F-score and rendering quality in indoor 3D reconstruction; on MuSHRoom, the reported mesh F-score rises from 0.6039 without priors to 0.9061 with both priors and DNC.

Depth Normal Consistency (DNC) is a regularization and filtering strategy employed during Gaussian Splatting-based 3D reconstruction to robustly integrate geometric priors, especially when combining noisy or low-resolution sensor depth with data-driven or monocular normal estimates. DNC measures the agreement between depth- and normal-derived surface orientation at each pixel, and adaptively filters unreliable geometric supervision in regions where these modalities disagree. The result is a more reliable geometric alignment, improved mesh accuracy, and better photorealistic rendering in challenging scenarios such as smartphone-based indoor reconstruction.

1. Concept and Definition

Depth Normal Consistency (DNC) quantifies the alignment between the local surface normal computed from depth data, $N_d(p)$, and an external normal estimate, $N_p(p)$ (typically from a monocular normal predictor). The consistency at pixel $p$ is measured by

$$\theta_d(p) = \arccos \left( \frac{N_d \cdot N_p}{\|N_d\| \|N_p\|} \right)$$

where $\theta_d(p)$ is the angular deviation in radians or degrees between the two normals. High values of $\theta_d$ indicate disagreement and thus potential unreliability in the raw depth or normal estimate at that location. This metric is used during supervision to selectively trust or disregard depth guides, thereby controlling the influence of external priors in the learning or optimization loop (Ren et al., 2024).
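As a minimal sketch of the angular metric above, assuming the two normal maps are stored as NumPy arrays of shape (H, W, 3), the per-pixel deviation can be computed as follows. The function name `angular_deviation` and the `eps` guard are illustrative choices, not part of the cited method.

```python
import numpy as np

def angular_deviation(n_a, n_b, eps=1e-8):
    """Per-pixel angle (radians) between two (H, W, 3) normal maps."""
    dot = np.sum(n_a * n_b, axis=-1)
    norms = np.linalg.norm(n_a, axis=-1) * np.linalg.norm(n_b, axis=-1)
    # Guard against zero-length normals and clip rounding error so arccos
    # stays defined. A zero-length normal yields cos = 0, i.e. 90 degrees.
    cos = np.clip(dot / np.maximum(norms, eps), -1.0, 1.0)
    return np.arccos(cos)

# Toy 1x2 maps: the first pixel agrees (normals need not be unit length),
# the second pixel is perpendicular.
n_d = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])
n_p = np.array([[[0.0, 0.0, 2.0], [1.0, 0.0, 0.0]]])
theta_deg = np.degrees(angular_deviation(n_d, n_p))
```

Dividing by the product of norms mirrors the formula directly, so the inputs do not have to be pre-normalized.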

2. Role in Adaptive Gaussian Splatting Pipelines

In the AGS-Mesh framework, DNC is integral to the training phase of Gaussian Splatting models intended for mesh extraction and novel view synthesis:

  • Sensor depth from smartphone LiDAR or similar sources is noisy and not always geometrically consistent with other cues.
  • Monocular normals predicted by pretrained networks (e.g., Omnidata) provide high-resolution but possibly biased normal estimates.

DNC operates by first computing $N_d(p)$ using local plane fitting or K-NN covariance on the raw sensor depth image at each pixel. The orientation agreement with $N_p(p)$ is then measured using the above angular metric. If $\theta_d(p)$ exceeds a threshold $\tau_d$ (e.g., $10^\circ$), the corresponding depth is considered unreliable and suppressed for that pixel:

$$D_f(p) = \begin{cases} 0 & \text{if } \theta_d(p) > \tau_d \\ D_s(p) & \text{otherwise} \end{cases}$$

where $D_s(p)$ is the original sensor depth and $D_f(p)$ is the filtered version (Ren et al., 2024).
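The depth filtering step can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: normals are estimated from back-projected points by finite differences (a simpler stand-in for the plane fitting or K-NN covariance mentioned above), the pinhole intrinsics `fx, fy, cx, cy` are hypothetical values, and normals are oriented toward the camera, which must match whatever convention the normal predictor uses.

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Estimate (H, W, 3) unit normals from a depth map (finite differences)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a camera-space 3D point (pinhole model).
    pts = np.stack([(u - cx) / fx * depth, (v - cy) / fy * depth, depth], axis=-1)
    dx = np.gradient(pts, axis=1)  # tangent along image columns
    dy = np.gradient(pts, axis=0)  # tangent along image rows
    n = np.cross(dy, dx)           # assumed convention: faces the camera (-z)
    return n / np.maximum(np.linalg.norm(n, axis=-1, keepdims=True), 1e-8)

def filter_depth(depth_s, n_d, n_p, tau_deg=10.0):
    """Zero out sensor depth where depth normals and predicted normals disagree."""
    norms = np.linalg.norm(n_d, axis=-1) * np.linalg.norm(n_p, axis=-1)
    cos = np.clip(np.sum(n_d * n_p, axis=-1) / np.maximum(norms, 1e-8), -1.0, 1.0)
    theta_deg = np.degrees(np.arccos(cos))
    return np.where(theta_deg > tau_deg, 0.0, depth_s)

# Toy scene: a fronto-parallel wall 2 m away, seen by a 4x4 sensor.
depth_s = np.full((4, 4), 2.0)
n_d = normals_from_depth(depth_s, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
# Predicted normals agree everywhere except one pixel tilted by 45 degrees.
n_p = np.tile(np.array([0.0, 0.0, -1.0]), (4, 4, 1))
n_p[0, 0] = np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0)
depth_f = filter_depth(depth_s, n_d, n_p, tau_deg=10.0)
```

Only the tilted pixel exceeds the 10 degree threshold, so its depth is set to zero while the other pixels keep their sensor value.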

During optimization, the loss used to supervise the model geometry is then:

$$L_D = \begin{cases} \| \hat D - D_s \|_1 & \text{for steps } < T_d \\ \| \hat D - D_f \|_1 & \text{for steps } \geq T_d \end{cases}$$

where $\hat D$ is the model's rendered depth, and $T_d$ is a transition iteration.
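The two-phase depth loss might be sketched like this. Two points are assumptions rather than statements from the source: pixels whose target depth is zero are treated as invalid and excluded from the loss (otherwise the filtered zeros would pull the rendered depth toward zero), and the L1 norm is averaged over valid pixels. The value of `t_d` is a hypothetical placeholder.

```python
import numpy as np

def depth_loss(d_hat, d_s, d_f, step, t_d):
    """L1 depth loss against raw depth before t_d, filtered depth afterwards."""
    target = d_s if step < t_d else d_f
    valid = target > 0  # assumption: zero depth marks an invalid or filtered pixel
    if not valid.any():
        return 0.0
    return float(np.abs(d_hat - target)[valid].mean())

d_hat = np.array([[1.0, 2.0], [3.0, 4.0]])  # rendered depth
d_s = np.array([[1.5, 2.0], [3.0, 8.0]])    # raw sensor depth, one outlier
d_f = np.array([[1.5, 2.0], [3.0, 0.0]])    # outlier suppressed by DNC

t_d = 1000  # hypothetical transition iteration
early = depth_loss(d_hat, d_s, d_f, step=0, t_d=t_d)
late = depth_loss(d_hat, d_s, d_f, step=t_d, t_d=t_d)
```

In this toy case the outlier dominates the early loss (1.125), while after the transition only the three trusted pixels contribute (about 0.167).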

3. Integration with Normal Regularization and Filtering

AGS-Mesh further extends the DNC idea to normal supervision. The normal consistency between the model's rendered normal, $\hat N(p)$, and $N_p(p)$ is computed:

$$\theta_n(p) = \arccos \left( \frac{\hat N \cdot N_p}{\|\hat N\| \|N_p\|} \right)$$

Filtering is performed analogously:

$$N_f(p) = \begin{cases} 0 & \text{if } \theta_n(p) > \tau_n \\ N_p(p) & \text{otherwise} \end{cases}$$

With this adaptive filtering, the model's normal supervision switches over iterations from using the full prior to only those pixels where consistency is high:

$$L_N = \begin{cases} \|\hat N - N_p\|_1 & \text{for steps } < T_n \\ \|\hat N - N_f\|_1 & \text{for steps } \geq T_n \end{cases}$$
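A comparable sketch for the normal branch is given below. Because the rendered normal changes during training, the filter is recomputed at every step. As assumptions not taken from the source, filtered pixels are excluded from the loss rather than supervised toward the zero vector, the L1 norm is averaged over pixels, and the values of `tau_deg` and `t_n` are hypothetical.

```python
import numpy as np

def filter_normals(n_hat, n_p, tau_deg):
    """Return filtered prior N_f and the boolean mask of consistent pixels."""
    norms = np.linalg.norm(n_hat, axis=-1) * np.linalg.norm(n_p, axis=-1)
    cos = np.clip(np.sum(n_hat * n_p, axis=-1) / np.maximum(norms, 1e-8), -1.0, 1.0)
    keep = np.degrees(np.arccos(cos)) <= tau_deg
    return np.where(keep[..., None], n_p, 0.0), keep

def normal_loss(n_hat, n_p, step, t_n, tau_deg=10.0):
    """L1 normal loss: full prior before t_n, consistent pixels only afterwards."""
    if step < t_n:
        return float(np.abs(n_hat - n_p).sum(axis=-1).mean())
    n_f, keep = filter_normals(n_hat, n_p, tau_deg)
    if not keep.any():
        return 0.0
    return float(np.abs(n_hat - n_f).sum(axis=-1)[keep].mean())

n_hat = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])  # rendered normals
n_p = np.array([[[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]])    # predicted normals

t_n = 1000  # hypothetical transition iteration
early = normal_loss(n_hat, n_p, step=0, t_n=t_n)
late = normal_loss(n_hat, n_p, step=t_n, t_n=t_n)
n_f, keep = filter_normals(n_hat, n_p, tau_deg=10.0)
```

Before the transition the perpendicular pixel contributes to the loss; afterwards it is masked out and only the consistent pixel is supervised.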

This dual adaptation ensures that both depth and normal priors are enforced only in plausible, reliably reconstructed regions, and that ambiguous or outlier regions do not bias the optimization (Ren et al., 2024).

4. Optimization Objective with DNC

The full objective function for AGS-Mesh employing DNC is:

$$L = L_\text{color} + \lambda_d L_D + \lambda_n L_N$$

where $L_\text{color}$ is a photometric loss (e.g., $L_1(\text{RGB})$ + D-SSIM), and $\lambda_d = 0.2$, $\lambda_n = 0.1$ are empirically chosen weights.
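To make the weighting concrete, here is a small worked example of the combined objective using the weights quoted above. The individual loss values are made-up numbers chosen only to illustrate the arithmetic, and `total_loss` is a hypothetical helper name.

```python
def total_loss(l_color, l_depth, l_normal, lambda_d=0.2, lambda_n=0.1):
    """Weighted sum of photometric, depth, and normal losses."""
    return l_color + lambda_d * l_depth + lambda_n * l_normal

# Illustrative values only: 0.05 + 0.2 * 0.10 + 0.1 * 0.30
loss = total_loss(l_color=0.05, l_depth=0.10, l_normal=0.30)
```

With these inputs the depth and normal terms contribute 0.02 and 0.03 respectively, for a total of roughly 0.10, so the geometric priors act as regularizers alongside the photometric term rather than replacing it.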

This objective enables robust learning of Gaussian splatting models that are well-aligned with real surfaces, suppressing noisy or inconsistent guidance sources, and is key for producing high-fidelity geometry from uncontrolled sensor input in indoor environments (Ren et al., 2024).

5. Empirical Impact and Mesh Refinement

Empirical results on the MuSHRoom and ScanNet++ datasets confirm that DNC-based adaptive filtering leads to:

  • Significant improvements in mesh accuracy (F-score, Chamfer-L1, normal consistency) over both vanilla and prior-augmented pipelines.
  • Superior photorealistic rendering (PSNR, SSIM) of synthesized novel views.
  • Ability to leverage both low-resolution sensor-based depth and high-resolution monocular normals simultaneously, extracting their complementary strengths and mitigating individual weaknesses.

An ablation on the MuSHRoom dataset demonstrates that introducing both priors and DNC increases mesh F-score from 0.6039 (no priors) up to 0.9061 (with both and DNC), and further to 0.9157 with full adaptive filtering and multiscale meshing (Ren et al., 2024).

6. Generalization and Limitations

Depth Normal Consistency as a gating mechanism is not limited to AGS-Mesh and can generalize to other 3DGS and 2DGS pipelines that require robust geometric supervision under multi-source uncertainty. Its main limitation is that, in regions where both priors (sensor depth and predicted normal) fail or disagree due to extreme noise or occlusion, it provides no strong supervision. The final reconstruction quality therefore still depends on the coverage and base quality of the priors.

A plausible implication is that DNC enables stable learning in real-world scenarios where geometric priors are essential but inherently noisy, such as in consumer device-based room scanning and real-time scene updating (Ren et al., 2024).
