Depth Normal Consistency in 3D Reconstruction

Updated 1 December 2025

Depth Normal Consistency (DNC) is a technique that measures the angular difference between sensor-derived and predicted normals to filter out unreliable geometric cues.
It is applied in adaptive Gaussian splatting pipelines to improve mesh accuracy and photorealism by selectively trusting reliable depth and normal guidance.
Empirical results demonstrate that using DNC significantly boosts mesh F-score and rendering quality, making it vital for robust indoor 3D reconstruction.

Depth Normal Consistency (DNC) is a regularization and filtering strategy employed during Gaussian Splatting-based 3D reconstruction to robustly integrate geometric priors, especially when combining noisy or low-resolution sensor depth with data-driven or monocular normal estimates. DNC measures the agreement between depth- and normal-derived surface orientation at each pixel, and adaptively filters unreliable geometric supervision in regions where these modalities disagree. The result is a more reliable geometric alignment, improved mesh accuracy, and better photorealistic rendering in challenging scenarios such as smartphone-based indoor reconstruction.

1. Concept and Definition

Depth Normal Consistency (DNC) quantifies the alignment between the local surface normal computed from depth data, $N_d(p)$, and an external normal estimate, $N_p(p)$ (typically from a monocular normal predictor). The consistency at pixel $p$ is measured by

$\theta_d(p) = \arccos \left( \frac{N_d \cdot N_p}{\|N_d\| \|N_p\|} \right)$

where $\theta_d(p)$ is the angular deviation between the two normals, expressed in radians or degrees. High values of $\theta_d$ indicate disagreement and thus potential unreliability in the raw depth or the normal estimate at that location. This metric is used during supervision to selectively trust or disregard depth guidance, thereby controlling the influence of external priors in the learning or optimization loop (Ren et al., 2024).
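As a minimal sketch of the angular metric above, the following NumPy function computes the per-pixel deviation between two normal maps. The function name, the `(H, W, 3)` array layout, and the `eps` guard are illustrative assumptions, not part of the cited work.

```python
import numpy as np

def angular_deviation_deg(n_a, n_b, eps=1e-8):
    """Per-pixel angle in degrees between two (H, W, 3) normal maps."""
    dot = np.sum(n_a * n_b, axis=-1)
    norms = np.linalg.norm(n_a, axis=-1) * np.linalg.norm(n_b, axis=-1)
    # eps avoids division by zero where a normal is missing (all zeros).
    cos = dot / np.maximum(norms, eps)
    # Clipping guards arccos against values slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Toy 1x3 "image": parallel, 45 degrees apart, and perpendicular normals.
n_d = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])
n_p = np.array([[[0.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 0.0, 0.0]]])
theta_d = angular_deviation_deg(n_d, n_p)
```

Because the dot product is divided by both norms, the inputs do not need to be unit length; the first toy pixel has a normal of length 2 and still yields an angle of zero.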

2. Role in Adaptive Gaussian Splatting Pipelines

In the AGS-Mesh framework, DNC is integral to the training phase of Gaussian Splatting models intended for mesh extraction and novel view synthesis:

Sensor depth from smartphone LiDAR or similar sources is noisy and not always geometrically consistent with other cues.
Monocular normals predicted by pretrained networks (e.g., Omnidata, ZoeDepth) provide high-resolution but possibly biased normal estimates.

DNC operates by first computing $N_d(p)$ using local plane fitting or K-NN covariance on the raw sensor depth image at each pixel. The orientation agreement with $N_p(p)$ is then measured using the above angular metric. If $\theta_d(p)$ exceeds a threshold $\tau_d$ (e.g., $10^\circ$), the corresponding depth is considered unreliable and suppressed for that pixel:

$D_f(p) = \begin{cases} 0 & \text{if } \theta_d(p) > \tau_d \\ D_s(p) & \text{otherwise} \end{cases}$

where $D_s(p)$ is the original sensor depth and $D_f(p)$ is the filtered version (Ren et al., 2024).
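The depth filtering step can be sketched as follows. Note that this sketch swaps in a simpler normal estimator than the one described: instead of plane fitting or K-NN covariance, it back-projects the depth map with a pinhole model and takes the cross product of finite-difference tangents. The intrinsics `fx, fy, cx, cy`, the camera-facing normal convention, and all function names are hypothetical choices for illustration.

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Approximate (H, W, 3) normals from a depth map (finite differences)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a camera-space 3D point.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)
    # Tangents along image rows (v) and columns (u).
    d_dv, d_du = np.gradient(pts, axis=(0, 1))
    # This cross-product order makes normals face the camera (-z), an assumed
    # convention that must match the normal predictor in practice.
    n = np.cross(d_dv, d_du)
    n /= np.maximum(np.linalg.norm(n, axis=-1, keepdims=True), 1e-8)
    return n

def filter_depth(depth_s, n_d, n_p, tau_deg=10.0):
    """Zero out sensor depth where the two normals differ by more than tau."""
    cos = np.sum(n_d * n_p, axis=-1) / np.maximum(
        np.linalg.norm(n_d, axis=-1) * np.linalg.norm(n_p, axis=-1), 1e-8)
    theta = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.where(theta > tau_deg, 0.0, depth_s), theta

# Fronto-parallel wall at 2 m; the stand-in "predicted" normals agree
# everywhere except one pixel that is tilted by 30 degrees.
depth = np.full((4, 4), 2.0)
n_d = normals_from_depth(depth, fx=100.0, fy=100.0, cx=1.5, cy=1.5)
n_p = np.tile([0.0, 0.0, -1.0], (4, 4, 1))
n_p[1, 2] = [0.0, np.sin(np.radians(30)), -np.cos(np.radians(30))]
depth_f, theta = filter_depth(depth, n_d, n_p)
```

In this toy case only the tilted pixel exceeds the $10^\circ$ threshold, so only its depth is suppressed. Finite differences are more sensitive to sensor noise than a fitted plane, which is one reason a real pipeline may prefer the estimators named in the text.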

During optimization, the loss used to supervise the model geometry is then:

$L_D = \begin{cases} \| \hat D - D_s \|_1 & \text{for steps } < T_d \\ \| \hat D - D_f \|_1 & \text{for steps } \geq T_d \end{cases}$

where $\hat D$ is the model's rendered depth, and $T_d$ is a transition iteration.
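The two-phase depth loss can be sketched as below. Read literally, the second branch would pull the rendered depth toward zero at suppressed pixels; the sketch assumes instead that zeros in the target mark pixels to be excluded from the loss, which is an interpretation rather than something the text states. Averaging over valid pixels, the function name, and the value `t_d=100` are likewise illustrative assumptions.

```python
import numpy as np

def depth_loss(d_hat, d_s, d_f, step, t_d):
    """L1 depth loss that switches from raw to filtered depth at step t_d."""
    target = d_s if step < t_d else d_f
    # Assumption: a target of 0 means "no supervision at this pixel".
    valid = target > 0
    if not valid.any():
        return 0.0
    return float(np.abs(d_hat - target)[valid].mean())

d_hat = np.array([[1.0, 2.0], [3.0, 4.0]])   # rendered depth
d_s = np.array([[1.5, 2.0], [3.0, 5.0]])     # raw sensor depth
d_f = np.array([[1.5, 2.0], [3.0, 0.0]])     # filtered: one pixel suppressed

early = depth_loss(d_hat, d_s, d_f, step=0, t_d=100)    # uses d_s
late = depth_loss(d_hat, d_s, d_f, step=100, t_d=100)   # uses d_f
```

In the early phase all four pixels contribute, including the one with a 1 m error; after the transition that pixel is ignored, so the loss reflects only the pixels that passed the consistency check.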

3. Integration with Normal Regularization and Filtering

AGS-Mesh further extends the DNC idea to normal supervision. The normal consistency between the model's rendered normal, $\hat N(p)$, and $N_p(p)$ is computed:

$\theta_n(p) = \arccos \left( \frac{\hat N \cdot N_p}{\|\hat N\| \|N_p\|} \right)$

Filtering is performed analogously:

$N_f(p) = \begin{cases} 0 & \text{if } \theta_n(p) > \tau_n \\ N_p(p) & \text{otherwise} \end{cases}$

With this adaptive filtering, the model's normal supervision switches over iterations from using the full prior to using only those pixels where consistency is high:

$L_N = \begin{cases} \|\hat N - N_p\|_1 & \text{for steps } < T_n \\ \|\hat N - N_f\|_1 & \text{for steps } \geq T_n \end{cases}$
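A minimal sketch of the normal branch follows, under the same masking assumption as for depth: pixels where $N_f$ is zero are treated as excluded from the loss rather than as a target of zero. The text gives no value for $\tau_n$ or $T_n$, so `tau_deg=10.0` and `t_n=100` are placeholders, and the function names are hypothetical.

```python
import numpy as np

def angle_deg(a, b, eps=1e-8):
    """Per-pixel angle in degrees between two (H, W, 3) normal maps."""
    cos = np.sum(a * b, axis=-1) / np.maximum(
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1), eps)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def normal_loss(n_hat, n_p, step, t_n, tau_deg=10.0):
    """L1 normal loss; after step t_n only consistent pixels are supervised."""
    if step < t_n:
        keep = np.ones(n_hat.shape[:-1], dtype=bool)
    else:
        # Keep pixels whose rendered normal agrees with the prior.
        keep = angle_deg(n_hat, n_p) <= tau_deg
    per_pixel = np.abs(n_hat - n_p).sum(axis=-1)
    # Assumption: average over kept pixels; max(..., 1) avoids dividing by 0.
    return float(per_pixel[keep].sum() / max(int(keep.sum()), 1))

# 1x2 toy image: the first pixel agrees, the second is off by 90 degrees.
n_hat = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])
n_p = np.array([[[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]])

early = normal_loss(n_hat, n_p, step=0, t_n=100)
late = normal_loss(n_hat, n_p, step=100, t_n=100)
```

Unlike the depth filter, which compares two fixed inputs, this mask depends on the model's current rendered normals, so it would need to be recomputed as training progresses.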

This dual adaptation ensures that both depth and normal priors are enforced only in plausible, reliably reconstructed regions, and that ambiguous or outlier regions do not bias the optimization (Ren et al., 2024).

4. Optimization Objective with DNC

The full objective function for AGS-Mesh employing DNC is:

$L = L_\text{color} + \lambda_d L_D + \lambda_n L_N$

where $L_\text{color}$ is a photometric loss (e.g., $L_1(\text{RGB})$ + D-SSIM), and $\lambda_d = 0.2$, $\lambda_n = 0.1$ are empirically chosen weights.
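To make the weighting concrete, the sketch below combines three scalar loss values with the weights quoted above. The input values are made up for illustration, and the photometric term is taken as a given number because the text does not specify how its $L_1$ and D-SSIM parts are balanced.

```python
def total_loss(l_color, l_depth, l_normal, lambda_d=0.2, lambda_n=0.1):
    """Weighted sum of photometric, depth, and normal losses."""
    return l_color + lambda_d * l_depth + lambda_n * l_normal

# Hypothetical per-iteration values, not measurements from the paper.
loss = total_loss(l_color=0.05, l_depth=0.375, l_normal=1.0)
# 0.05 + 0.2 * 0.375 + 0.1 * 1.0 = 0.225
```

With these weights the geometric terms act as regularizers alongside the photometric term; their relative contribution in practice depends on the scale of each loss.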

This objective enables robust learning of Gaussian splatting models that are well-aligned with real surfaces, suppressing noisy or inconsistent guidance sources, and is key for producing high-fidelity geometry from uncontrolled sensor input in indoor environments (Ren et al., 2024).

5. Empirical Impact and Mesh Refinement

Empirical results on the MuSHRoom and ScanNet++ datasets confirm that DNC-based adaptive filtering leads to:

Significant improvements in mesh accuracy (F-score, Chamfer-L1, normal consistency) over both vanilla and prior-augmented pipelines.
Superior photorealistic rendering (PSNR, SSIM) of synthesized novel views.
Ability to leverage both low-resolution sensor-based depth and high-resolution monocular normals simultaneously, extracting their complementary strengths and mitigating individual weaknesses.

An ablation on the MuSHRoom dataset demonstrates that introducing both priors and DNC increases mesh F-score from 0.6039 (no priors) up to 0.9061 (with both and DNC), and further to 0.9157 with full adaptive filtering and multiscale meshing (Ren et al., 2024).

6. Generalization and Limitations

Depth Normal Consistency as a gating mechanism is not limited to AGS-Mesh; it can generalize to other 3DGS and 2DGS pipelines that require robust geometric supervision under multi-source uncertainty. Its main limitation is that, in regions where both priors (sensor depth and predicted normal) fail or disagree due to extreme noise or occlusion, it provides no strong supervision, so the final reconstruction quality still relies on the coverage and base quality of the priors.

A plausible implication is that DNC enables stable learning in real-world scenarios where geometric priors are essential but inherently noisy, such as in consumer device-based room scanning and real-time scene updating (Ren et al., 2024).

References (1)

Ren et al. (2024). AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones.
