
Point Map Normal Metric

Updated 4 July 2026
  • Point map normal metric is a dense evaluation measure that quantifies local surface orientation accuracy from predicted 3D point maps.
  • It computes the mean angular error between predicted and reference normals using finite differences and cross products, highlighting local surface irregularities.
  • This metric complements pointwise errors by detecting high-frequency surface defects and driving improvements through point gradient matching losses.

The point map normal metric is an evaluation measure for point-map-based 3D reconstruction that compares the local surface orientation induced by neighboring predicted 3D points to the corresponding orientation in a reference point map. In the monocular-geometry setting, it was introduced to address a limitation of standard pointwise metrics: global and local point accuracy can remain nearly unchanged under perturbations of similar magnitude even when one perturbation produces strongly incoherent local surfaces and visible geometric artifacts (Knaebel et al., 29 May 2026). In adjacent reconstruction work, dense point maps are likewise treated as metric geometric fields from which normals can be derived, while in other mathematical literatures the phrase “normal metric” refers to unrelated constructions such as metrics with normal structure or intrinsic path metrics on polyhedral surfaces (Zhou et al., 27 Nov 2025, Nabiei, 2016, Wang, 2018).

1. Point maps and induced local surface geometry

In the relevant 3D reconstruction literature, a point map is a dense image-aligned field

$$\hat{\mathbf P} \in \mathbb{R}^{H \times W \times 3},$$

predicted from a single RGB image

$$\mathbf I \in \mathbb{R}^{H \times W \times 3},$$

so that each pixel $(i,j)$ is assigned a 3D point $\hat{\mathbf P}_{ij}$ in camera coordinates. In SurGe, the decoder predicts per-pixel parameters $(\xi,\eta,\rho)$, which are mapped to 3D coordinates as

$$(x,y,z) = (\xi e^\rho,\ \eta e^\rho,\ e^\rho).$$

Here $z=e^\rho$ is a shared scale factor for all coordinates, and the representation is aligned with the affine-invariant geometry used in DUSt3R and MoGe (Knaebel et al., 29 May 2026).
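The parameter-to-point mapping above can be sketched in a few lines of NumPy. The function name `params_to_points` and the toy inputs are illustrative assumptions; only the formula itself comes from the text.

```python
import numpy as np

def params_to_points(xi, eta, rho):
    """Map per-pixel (xi, eta, rho) arrays of shape (H, W) to an (H, W, 3) point map.

    Implements (x, y, z) = (xi * e^rho, eta * e^rho, e^rho), so z = e^rho
    scales all three coordinates and is always positive.
    """
    z = np.exp(rho)
    return np.stack([xi * z, eta * z, z], axis=-1)

# Toy 1x2 "image": two pixels at depths 2 and 4 (hypothetical values).
xi = np.array([[0.0, 0.5]])
eta = np.array([[0.0, -0.5]])
rho = np.log(np.array([[2.0, 4.0]]))
points = params_to_points(xi, eta, rho)
```

Predicting log-depth $\rho$ rather than $z$ directly keeps the depth positive by construction, which is one plausible reason for this parameterization.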

Local surface geometry is induced directly by neighboring 3D differences in the point map. The paper uses horizontal and vertical finite differences,

$$\Delta_x \mathbf P_{ij} = \mathbf P_{i,j+1} - \mathbf P_{ij}, \qquad \Delta_y \mathbf P_{ij} = \mathbf P_{i+1,j} - \mathbf P_{ij},$$

and interprets local geometry through how these displacements combine. In this formulation, orientation is the direction of the local surface normal, while smoothness is reflected in how these normals vary spatially; high-frequency orientation changes correspond to ripples, blockiness, or bent local patches (Knaebel et al., 29 May 2026).
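As a minimal sketch of how these displacements combine, the following NumPy snippet computes the two forward differences and one cross-product normal per pixel. The function names, the cross-product order (which fixes the normal's sign), and the toy plane are assumptions made for illustration.

```python
import numpy as np

def forward_differences(P):
    """Horizontal and vertical forward differences of an (H, W, 3) point map."""
    dx = P[:, 1:] - P[:, :-1]   # P[i, j+1] - P[i, j], shape (H, W-1, 3)
    dy = P[1:, :] - P[:-1, :]   # P[i+1, j] - P[i, j], shape (H-1, W, 3)
    return dx, dy

def cross_product_normals(P):
    """One unit normal per pixel of the top-left (H-1, W-1) block."""
    dx, dy = forward_differences(P)
    n = np.cross(dx[:-1], dy[:, :-1])  # sign convention is an assumption
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

# Fronto-parallel plane at depth 2: every normal points along the z axis.
jj, ii = np.meshgrid(np.arange(4.0), np.arange(3.0))
plane = np.stack([jj, ii, np.full_like(jj, 2.0)], axis=-1)
normals = cross_product_normals(plane)
```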

A closely related construction appears in Splat-SAP, where each image pixel is associated with a 3D point in a scale-aware point map $X_t^i(u,v)\in\mathbb{R}^3$. That work does not define a normal-based loss, but it explicitly notes that because $X_t^i$ provides a dense 3D point for each pixel, surface normals can be derived at any pixel by local finite differences or local plane fitting. This places point maps in a broader category of dense metric geometric fields from which normal-based evaluation can be built (Zhou et al., 27 Nov 2025).

2. Formal definition of the metric

The point map normal metric is defined from local normals computed around each valid pixel. The evaluation procedure first forms four local normals around each pixel from cross products of adjacent point map differences. The valid local normals are then averaged, producing a ground-truth normal map $\mathbf N$ and a predicted normal map $\hat{\mathbf N}$. The valid set $\mathcal V$ contains pixels whose annotated neighborhood defines at least one valid local normal (Knaebel et al., 29 May 2026).

The metric itself is the point map normal mean angular error:

$$\mathrm{MAE}_{\mathbf N} = \frac{1}{|\mathcal V|} \sum_{(i,j) \in \mathcal V} \angle\big(\hat{\mathbf N}_{ij}, \mathbf N_{ij}\big),$$

where the angle between unit vectors is measured in degrees as

$$\angle(\mathbf a, \mathbf b) = \frac{180}{\pi} \arccos\big(\mathbf a^\top \mathbf b\big).$$

All reported values are in degrees; lower is better (Knaebel et al., 29 May 2026).

The metric can be evaluated in different masking regimes. The paper reports it on instance regions for datasets with instance masks and globally for datasets with dense ground truth that supports reliable normal estimation. The computation is entirely local and linear in the number of pixels, since each pixel uses only a small neighborhood and a small number of cross products (Knaebel et al., 29 May 2026).
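The evaluation procedure described in this section can be sketched as follows. This is an illustrative NumPy version, not the paper's implementation: the function names, the neighbor ordering (right, down, left, up), the degenerate-normal threshold, and the choice to score only pixels where both maps yield a normal are all assumptions.

```python
import numpy as np

def point_map_normals(P, valid):
    """Average up to four local cross-product normals per pixel.

    P: (H, W, 3) float point map; valid: (H, W) bool mask.
    Returns unit normals (H, W, 3) and a mask of pixels with >= 1 valid normal.
    """
    H, W, _ = P.shape
    Pp = np.pad(P, ((1, 1), (1, 1), (0, 0)))
    Vp = np.pad(valid, 1)  # padded border counts as invalid
    offsets = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    diffs, ok = [], []
    for di, dj in offsets:
        sl = (slice(1 + di, 1 + di + H), slice(1 + dj, 1 + dj + W))
        diffs.append(Pp[sl] - P)
        ok.append(Vp[sl] & valid)
    acc = np.zeros((H, W, 3))
    count = np.zeros((H, W), dtype=int)
    for k in range(4):
        n = np.cross(diffs[k], diffs[(k + 1) % 4])  # adjacent difference pair
        norm = np.linalg.norm(n, axis=-1)
        good = ok[k] & ok[(k + 1) % 4] & (norm > 1e-12)
        acc += np.where(good[..., None], n / np.where(good, norm, 1.0)[..., None], 0.0)
        count += good
    length = np.linalg.norm(acc, axis=-1)
    normal_valid = (count > 0) & (length > 1e-12)
    return acc / np.where(normal_valid, length, 1.0)[..., None], normal_valid

def normal_mae_degrees(P_pred, P_gt, valid):
    """Mean angular error in degrees between predicted and reference normals."""
    n_pred, v_pred = point_map_normals(P_pred, valid)
    n_gt, v_gt = point_map_normals(P_gt, valid)
    both = v_pred & v_gt
    cos = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos))[both].mean())

# Usage: a fronto-parallel plane versus a plane tilted by 45 degrees about the y axis.
jj, ii = np.meshgrid(np.arange(6.0), np.arange(5.0))
gt = np.stack([jj, ii, np.full_like(jj, 2.0)], axis=-1)
tilted = np.stack([jj, ii, 2.0 + jj], axis=-1)
mask = np.ones(gt.shape[:2], dtype=bool)
error_deg = normal_mae_degrees(tilted, gt, mask)  # approximately 45 degrees
```

Each pixel touches only its four neighbors, so the cost grows linearly with the number of pixels, and adding a constant offset to every point leaves the result unchanged because only differences enter the cross products.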

Two properties are central. First, the metric is derived from cross products of neighboring differences, so it measures the orientation of the induced tangent plane rather than pointwise Euclidean displacement. Second, because it is normal-based, it is comparatively insensitive to uniform translation and more sensitive to local curvature, ripples, and bending than pointwise distance metrics (Knaebel et al., 29 May 2026).

3. Relation to pointwise point-map metrics

The point map normal metric was introduced against the background of pointwise 3D evaluation. A standard global metric is global point map AbsRel: after a global scale-and-translation alignment, one computes

$$\frac{\lVert \hat{\mathbf P}_{ij} - \mathbf P_{ij} \rVert_2}{\lVert \mathbf P_{ij} \rVert_2}$$

and averages over valid pixels. A local point map metric uses a similar pointwise construction but with separate alignment per instance mask or object. These metrics are pointwise: they penalize each 3D point independently after alignment and are therefore dominated by global placement errors (Knaebel et al., 29 May 2026).
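For comparison with the normal metric, a pointwise relative error of this kind can be sketched as below. The alignment here is a plain least-squares fit of one scale and one 3D translation, which is an assumption chosen for brevity; the cited papers may use a different, more robust alignment. The function names are likewise hypothetical.

```python
import numpy as np

def align_scale_translation(pred, gt):
    """Least-squares scale s and translation t minimizing ||s * pred + t - gt||^2.

    pred, gt: (N, 3) arrays of corresponding points.
    """
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    pc, gc = pred - mu_p, gt - mu_g
    s = np.sum(pc * gc) / np.sum(pc * pc)
    t = mu_g - s * mu_p
    return s, t

def point_absrel(P_pred, P_gt, valid):
    """Mean relative 3D position error over valid pixels after global alignment."""
    pred, gt = P_pred[valid], P_gt[valid]
    s, t = align_scale_translation(pred, gt)
    err = np.linalg.norm(s * pred + t - gt, axis=-1) / np.linalg.norm(gt, axis=-1)
    return float(err.mean())
```

A per-instance variant would call `point_absrel` once per instance mask, so that each object receives its own scale and translation.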

The distinction between these families is structural:

| Metric | Alignment or correspondence | Measured quantity |
| --- | --- | --- |
| Global point map AbsRel | One global scale-and-translation alignment | Pointwise 3D position error magnitude |
| Local point map metric | Separate alignment per instance or object | Pointwise local 3D position error magnitude |
| Point map normal metric | No separate pointwise distance term; normals from local neighborhoods | Local surface orientation error |

This difference matters because small-amplitude high-frequency perturbations can leave global or local AbsRel nearly unchanged while severely degrading local surface coherence. SurGe illustrates this explicitly: a low-frequency perturbation and a high-frequency perturbation of similar pointwise magnitude yield very similar global and local AbsRel, yet the high-frequency perturbation produces strongly incoherent local surfaces and a much larger mean angular normal error (Knaebel et al., 29 May 2026).

The paper therefore characterizes normals as a high-pass filter on geometry. Small positional errors at high spatial frequencies cause large changes in normals, while large smooth errors such as global tilt or uniform mis-scaling cause smaller normal changes. This makes the metric a more direct probe of local surface fidelity and perceived 3D realism, especially for thin structures and detail-sensitive regions such as lampposts, chair legs, wires, and faucets (Knaebel et al., 29 May 2026).
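The high-pass behavior can be reproduced on a toy example. The sketch below perturbs the depth of a flat plane with two sinusoids of equal amplitude but different spatial frequency and compares the mean pointwise displacement with the mean normal deviation. The grid size, pixel spacing, amplitude, and frequencies are arbitrary assumptions, and the normals use a single forward-difference cross product rather than the four-normal average used in the evaluation.

```python
import numpy as np

def simple_normals(P):
    """One forward-difference cross-product normal per pixel (top-left block)."""
    dx = P[:-1, 1:] - P[:-1, :-1]
    dy = P[1:, :-1] - P[:-1, :-1]
    n = np.cross(dx, dy)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def mean_angle_deg(P_a, P_b):
    """Mean angle in degrees between the normals of two point maps."""
    cos = np.sum(simple_normals(P_a) * simple_normals(P_b), axis=-1)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())

H = W = 64
jj, ii = np.meshgrid(np.arange(W), np.arange(H))
x, y = (jj - W / 2) * 0.01, (ii - H / 2) * 0.01   # 1 cm pixel spacing (assumed)
plane = np.stack([x, y, np.full_like(x, 2.0)], axis=-1)

def perturb(cycles, amplitude=0.005):
    """Add a horizontal depth ripple with the given number of cycles across the image."""
    out = plane.copy()
    out[..., 2] += amplitude * np.sin(2 * np.pi * cycles * jj / W)
    return out

low, high = perturb(cycles=1), perturb(cycles=16)
point_err_low = float(np.abs(low[..., 2] - plane[..., 2]).mean())
point_err_high = float(np.abs(high[..., 2] - plane[..., 2]).mean())
angle_low = mean_angle_deg(low, plane)
angle_high = mean_angle_deg(high, plane)
```

In this setup the two pointwise errors are of the same order, while the high-frequency ripple produces a normal deviation many times larger than the low-frequency one.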

4. Training-time counterparts: point gradient matching and local feature mixing

Although the point map normal metric is an evaluation measure, SurGe introduces a training loss that is explicitly aligned with it: the point gradient matching loss. For a point map $\mathbf P$ whose point $\mathbf P_{ij}$ has depth $z_{ij}$, the forward finite differences are depth-normalized using the nearer endpoint’s depth,

$$\tilde\Delta_x \mathbf P_{ij} = \frac{\mathbf P_{i,j+1} - \mathbf P_{ij}}{\min(z_{ij},\, z_{i,j+1})}, \qquad \tilde\Delta_y \mathbf P_{ij} = \frac{\mathbf P_{i+1,j} - \mathbf P_{ij}}{\min(z_{ij},\, z_{i+1,j})}.$$

The loss then penalizes the discrepancy between these normalized 3D gradients in the prediction and in the ground truth. Because local normals are formed from cross products of neighboring differences, matching these 3D gradients tends to improve the same local geometry that the point map normal metric evaluates (Knaebel et al., 29 May 2026).
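A rough sketch of such a gradient matching term is given below. Only the depth normalization by the nearer endpoint follows the description above; the L1 penalty, the plain mean reduction, the absence of validity masks, and the function names are assumptions and may differ from the loss used in SurGe.

```python
import numpy as np

def normalized_gradients(P):
    """Forward differences of an (H, W, 3) point map, divided by the nearer endpoint depth."""
    z = P[..., 2]
    dx = (P[:, 1:] - P[:, :-1]) / np.minimum(z[:, 1:], z[:, :-1])[..., None]
    dy = (P[1:, :] - P[:-1, :]) / np.minimum(z[1:, :], z[:-1, :])[..., None]
    return dx, dy

def point_gradient_matching_loss(P_pred, P_gt):
    """Mean absolute difference between normalized 3D gradients (L1 is an assumption)."""
    dx_p, dy_p = normalized_gradients(P_pred)
    dx_g, dy_g = normalized_gradients(P_gt)
    return float(np.abs(dx_p - dx_g).mean() + np.abs(dy_p - dy_g).mean())
```

One consequence of the depth normalization is that uniformly rescaling a point map leaves its normalized gradients unchanged, so a term of this form responds to local shape rather than to global scale.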

The associated architectural component is the Neighborhood Attention Decoder (NAD), designed to improve local surface detail. SurGe uses a DINOv2 ViT-Large encoder and a 5-stage upsampling decoder. Each stage contains Neighborhood Attention blocks with local attention windows, QK normalization, RoPE with window-matched base frequency, and a feed-forward network. The stated motivation is that local, content-aware feature mixing is better suited than fixed-kernel convolution to difficult local geometry such as thin structures against distant backgrounds (Knaebel et al., 29 May 2026).

The ablations are consistent with that design. Under the same global and local point losses, the point gradient matching loss improves both local point-map metrics and normal MAE relative to no surface loss, a normal-only loss, and a log-depth gradient loss. Decoder ablations likewise report that NAD achieves the lowest normal MAE across all tested datasets, alongside the best local point map AbsRel and strong global point map AbsRel (Knaebel et al., 29 May 2026).

5. Benchmarks and empirical behavior

The point map normal metric is used in zero-shot monocular geometry evaluation on ETH3D, iBims-1, GSO, Sintel, and DIODE. On these benchmarks, SurGe reports the following point map normal mean angular errors, in degrees, with lower being better (Knaebel et al., 29 May 2026):

| Dataset | Best prior in table (degrees) | SurGe (degrees) |
| --- | --- | --- |
| ETH3D | 19.5 (MoGe) | 18.3 |
| iBims-1 | 17.6 (MoGe) | 16.5 |
| GSO | 10.6 (VGGT) | 10.5 |
| Sintel | 25.2 (MoGe) | 24.5 |
| DIODE | 12.4 (MoGe) | 12.0 |

These results support two claims emphasized in the paper. First, the method consistently achieves the lowest normal MAE across the reported datasets. Second, the normal improvements are not obtained by sacrificing global geometry: the model also achieves the best average rank for global point map AbsRel across eight zero-shot monocular geometry benchmarks and consistently improves local point-map evaluation (Knaebel et al., 29 May 2026).

The reported improvements over MoGe and MoGe-2 are often on the order of 1–3 degrees in mean angular error. In the context of average normal error, the paper interprets this as substantial, particularly because the metric targets failure modes that are weakly reflected in pointwise geometry scores. Qualitative comparisons reinforce that interpretation: competing models show noisy or rippled local normal patterns on thin structures and complicated backgrounds, whereas SurGe produces smoother local orientation fields while preserving sharper geometric transitions at object boundaries (Knaebel et al., 29 May 2026).

The point map normal metric belongs to a broader family of normal-based geometry measures, but its precise setting is distinct. In Splat-SAP, the point map is first transformed from a canonical representation into a scale-aware point map in real space, and the paper notes that normals can then be derived from local finite differences or local plane fitting. However, Splat-SAP does not itself define or optimize a normal metric; normals are presented as a derivable by-product of dense metric point maps (Zhou et al., 27 Nov 2025). This suggests that point map normal evaluation is naturally compatible with scale-aware point-map pipelines even when it is not explicitly built into the training objective.

A different but related construction appears in OCMG-Net for unstructured point clouds. There, the Chamfer Normal Distance (CND) compares predicted normals on a noisy point set to normals on the closest points of a clean reference set. CND borrows the nearest-neighbor matching principle of Chamfer Distance but measures angular normal discrepancy rather than position error. Unlike the dense image-aligned point map normal metric, CND is defined on unstructured point clouds with attached normals and is asymmetric, using correspondences from noisy to clean points only (Wu et al., 2024).
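The nearest-neighbor matching idea can be sketched as follows. This is not the CND definition from OCMG-Net: the root-mean-square aggregation, the use of oriented normals, the brute-force neighbor search, and the function name are assumptions made to keep the example short.

```python
import numpy as np

def chamfer_style_normal_error(noisy_pts, pred_normals, clean_pts, clean_normals):
    """RMS angle in degrees between predicted normals and the normals of the
    nearest clean points (one-directional: noisy -> clean only).

    noisy_pts, pred_normals: (N, 3); clean_pts, clean_normals: (M, 3).
    Normals are assumed to be unit length.
    """
    # Brute-force squared distances, shape (N, M); fine for small point sets.
    d2 = np.sum((noisy_pts[:, None, :] - clean_pts[None, :, :]) ** 2, axis=-1)
    nearest = d2.argmin(axis=1)
    cos = np.clip(np.sum(pred_normals * clean_normals[nearest], axis=-1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    return float(np.sqrt(np.mean(angles ** 2)))
```

Because correspondences come from spatial proximity rather than from a shared pixel grid, this kind of measure applies to unstructured clouds, whereas the point map normal metric relies on image alignment for its correspondences.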

The terminology also has unrelated meanings in pure and geometric analysis. In one operator-theoretic setting, a paper defines a new metric and proves that the resulting metric space has normal structure, enabling fixed-point theorems for families of non-expansive maps and for groups of h-biholomorphic automorphisms (Nabiei, 2016). In another setting, the farthest point map is studied on the surface of a centrally symmetric convex polyhedron equipped with the intrinsic path metric, described as the natural metric induced by the polyhedral surface (Wang, 2018). These usages share vocabulary with “point map normal metric” but not its reconstruction-specific meaning.

Within SurGe itself, several limitations and extensions are identified or implied. The metric is sensitive to noise and discretization because normals are estimated from finite differences and cross products; this is desirable for detecting local surface defects, but it can also amplify annotation noise. It depends on the quality and density of ground truth, and it measures only first-order geometry rather than curvature or higher-order smoothness. Suggested extensions include patch-wise plane fitting, curvature-based metrics, multi-scale normal consistency, and mask-aware evaluation that separates thin structures or edges from broad flat surfaces (Knaebel et al., 29 May 2026).
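Of the suggested extensions, patch-wise plane fitting is the simplest to illustrate. The sketch below fits a plane to a small set of 3D points by singular value decomposition and takes the direction of least variance as the normal. The function name and the sign convention (non-negative z component) are assumptions; how patches would be selected and masked is left open.

```python
import numpy as np

def plane_fit_normal(points):
    """Unit normal of the least-squares plane through an (N, 3) set of points, N >= 3.

    The normal is the right singular vector with the smallest singular value
    of the centered points; its sign is fixed so that the z component is >= 0.
    """
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n = vt[-1]
    return n if n[2] >= 0 else -n

# A 3x3 patch sampled from the plane z = 2 + x (hypothetical data).
px, py = np.meshgrid(np.arange(3.0), np.arange(3.0))
patch = np.stack([px.ravel(), py.ravel(), 2.0 + px.ravel()], axis=-1)
patch_normal = plane_fit_normal(patch)
```

Fitting over a patch averages out per-pixel noise, which addresses the noise sensitivity noted above at the cost of blurring orientation changes inside the patch.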

Taken together, these formulations place the point map normal metric in a specific niche: it is a dense, image-aligned, angular measure of local surface orientation fidelity. Its role is complementary to pointwise point-map errors, not a replacement for them. Pointwise metrics quantify how accurately points are placed after alignment; the point map normal metric quantifies whether those points form the right local surface.
