Compact Geometric Features (CGF)
- Compact Geometric Features (CGF) are learned descriptors that encapsulate local 3D geometry from unstructured point clouds using spherical histograms.
- They use a deep multilayer perceptron to map high-dimensional histograms into compact Euclidean spaces, optimized with a triplet loss for discriminative power.
- CGF achieves high precision, compactness, and fast query times compared to traditional hand-crafted descriptors, making it ideal for robotics and 3D vision applications.
Compact Geometric Features (CGF) are learned representations that encapsulate the local geometry around a point in an unstructured point cloud. CGF is designed to facilitate geometric registration tasks central to robotics and 3D vision, where matching and aligning scans from different viewpoints or temporally separated acquisitions is essential. Unlike prior hand-crafted descriptors, CGF achieves precision, compactness, and robustness through a pipeline that maps high-dimensional spherical histograms of point neighborhoods into low-dimensional Euclidean feature spaces using deep neural networks, specifically a multi-layer perceptron (MLP) trained with a triplet embedding loss (Khoury et al., 2017).
1. Local Spherical Histogram Construction
Given an unstructured point cloud P and a central point p ∈ P, the method defines a spherical support region of radius r. A local reference frame at p is estimated by computing the normal n (e.g., by principal component analysis on a small neighborhood) and two orthogonal tangent vectors, ensuring a right-handed coordinate system.
Each neighbor q in the support region is converted to spherical coordinates (ρ, θ, φ) in the local frame, where ρ = ‖q − p‖ is the radial distance, θ is elevation, and φ is azimuth. The spherical region is discretized into:
- n_r radial bins: thresholds are logarithmically spaced between an inner radius r_min and r,
- n_e elevation bins: uniform, each of extent π/n_e,
- n_a azimuth bins: uniform, each of extent 2π/n_a.
With these subdivisions, the total number of bins is B = n_r · n_e · n_a. A normalized histogram h = (h_1, …, h_B) is built for each p, with h_i = |{q ∈ N(p) : bin(q) = i}| / |N(p)|, where bin(q) indicates the bin assignment of q, and N(p) is the set of points in the support region (excluding p itself). This histogram serves as the high-dimensional representation of the local geometric context.
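The histogram construction above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the bin counts, the inner radius handling, and the assumption that the local reference frame is given as a 3×3 matrix of row axes are all choices made here for clarity.

```python
import numpy as np

def spherical_histogram(points, p, lrf, r, r_min, n_r=4, n_e=4, n_a=8):
    """Normalized spherical histogram of the neighborhood of p.

    points : (N, 3) cloud points;  p : (3,) center point
    lrf    : (3, 3) rows are the local frame axes (tangent1, tangent2, normal)
    """
    local = (points - p) @ lrf.T            # express neighbors in the local frame
    rho = np.linalg.norm(local, axis=1)
    mask = (rho > 1e-12) & (rho <= r)       # support region, excluding p itself
    local, rho = local[mask], rho[mask]

    theta = np.arccos(np.clip(local[:, 2] / rho, -1.0, 1.0))   # elevation in [0, pi]
    phi = np.arctan2(local[:, 1], local[:, 0]) % (2 * np.pi)   # azimuth in [0, 2pi)

    # radial thresholds logarithmically spaced between r_min and r
    edges = np.geomspace(r_min, r, n_r)
    i_r = np.clip(np.searchsorted(edges, rho), 0, n_r - 1)
    i_e = np.minimum((theta / np.pi * n_e).astype(int), n_e - 1)
    i_a = np.minimum((phi / (2 * np.pi) * n_a).astype(int), n_a - 1)

    flat = (i_r * n_e + i_e) * n_a + i_a    # flatten 3D bin index
    hist = np.bincount(flat, minlength=n_r * n_e * n_a).astype(float)
    return hist / max(mask.sum(), 1)        # normalize by neighbor count
```

Because every in-support neighbor falls into exactly one bin, the returned histogram sums to 1 whenever the support region is non-empty.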
2. Deep Nonlinear Embedding of Histograms
To achieve compactness, the constructed histogram h ∈ R^B is mapped to a much lower-dimensional Euclidean space via a learned nonlinear embedding f : R^B → R^d, where typically d ≪ B. The embedding is implemented as a fully-connected MLP with the following architecture:
- Input layer: size B (the histogram dimensionality)
- Five hidden layers: each with 512 ReLU units
- Output layer: size d, linear activation
The forward operation is a chain of affine transformations interleaved with ReLU nonlinearities, f(h) = W_6 σ(W_5 σ(⋯ σ(W_1 h + b_1) ⋯) + b_5) + b_6, where σ is the elementwise ReLU, and the output f(h) ∈ R^d is the compact feature. Feature dimensionality d is a tunable parameter; in practice, values such as 12, 32, and 64 are used to balance compactness versus discriminative power. The resulting features are denoted CGF-d.
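The forward pass above can be sketched as a plain NumPy computation. The weights here are randomly initialized for illustration only; in the actual method they come from training with the triplet loss described next.

```python
import numpy as np

def init_mlp(b, d, hidden=512, n_hidden=5, seed=0):
    """Random He-style initialization of the MLP weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    sizes = [b] + [hidden] * n_hidden + [d]
    return [(rng.normal(scale=np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def embed(hist, params):
    """Map a histogram to its compact feature: affine + ReLU layers, linear output."""
    x = hist
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)      # hidden layers with ReLU
    W, b = params[-1]
    return x @ W + b                        # linear output: the CGF feature
```

With b = 64 input bins and d = 32, `embed` reduces the descriptor to a 32-dimensional vector, matching the CGF-32 setting.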
3. Triplet Loss Metric Learning
The MLP parameters θ are optimized using a margin-based triplet loss. Training data consist of triplets (x_a, x_p, x_n), with x_a (anchor) and x_p (positive) being histograms from true correspondences (points at distance at most τ under the ground-truth alignment) and x_n (negative) drawn from non-correspondences (points farther than τ apart). The objective is L(θ) = Σ_triplets max(0, ‖f(x_a) − f(x_p)‖² − ‖f(x_a) − f(x_n)‖² + m), where f is the embedding network, m > 0 is the margin, and θ are the network parameters. Optimization is performed using Adam with a fixed learning rate, minibatch size 512, and 3 epochs.
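The per-triplet term of the objective can be sketched directly; the margin value m = 1.0 used here is an assumption for illustration, not a value taken from the source.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, margin=1.0):
    """Margin-based triplet loss on embedded descriptors.

    f_a, f_p, f_n : embedded anchor, positive, and negative features.
    The loss is zero once the negative is farther than the positive
    by at least the margin (in squared Euclidean distance).
    """
    d_pos = np.sum((f_a - f_p) ** 2)
    d_neg = np.sum((f_a - f_n) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

When anchor and positive coincide and the negative is far away, the hinge is inactive and the loss is zero; when all three coincide, the loss equals the margin.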
4. Precision, Compactness, and Robustness
The descriptor family achieves high discriminativeness due to the triplet metric learning, compactness (e.g., CGF-32 outperforms descriptors ranging from 33 to 1,980 dimensions), and robustness to real-world challenges such as noise or missing data, owing to training on diverse real and synthetic scans.
Evaluation follows these protocols:
- Precision: Given two overlapping scans and a ground-truth transform T*, feature-based correspondences C are formed by nearest-neighbor search in feature space. A pair (p, q) ∈ C is counted as correct if its ground-truth distance is within a threshold τ. Precision at threshold τ is given by P(τ) = |{(p, q) ∈ C : ‖T*(p) − q‖ ≤ τ}| / |C|, with typical values τ = 1% of the model diameter (laser scans) or τ = 10 cm (indoor scenes).
- Registration metrics: RMSE of estimated versus ground-truth transformation, and recall (fraction of aligned fragment pairs within a distance threshold).
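The precision protocol can be sketched as follows. Brute-force feature matching stands in for the k-d tree search used in practice, and all array names are illustrative.

```python
import numpy as np

def match_precision(feat_src, feat_dst, pts_src, pts_dst, T, tau):
    """Fraction of nearest-neighbor feature matches whose ground-truth
    residual under the known 4x4 transform T is within tau."""
    # brute-force nearest neighbor in feature space
    d2 = ((feat_src[:, None, :] - feat_dst[None, :, :]) ** 2).sum(-1)
    nn = d2.argmin(axis=1)                       # index of best match per source point
    src_h = np.c_[pts_src, np.ones(len(pts_src))]  # homogeneous coordinates
    mapped = (src_h @ T.T)[:, :3]                # apply ground-truth alignment
    resid = np.linalg.norm(mapped - pts_dst[nn], axis=1)
    return (resid <= tau).mean()
```

As a sanity check, matching a cloud against itself with the identity transform and its own points as "features" yields precision 1.0.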
5. Quantitative Comparison with Hand-Crafted Descriptors
Empirical results demonstrate the superiority of CGF descriptors over several hand-crafted alternatives on independent test sets. Key comparisons:
| Descriptor | Dimensionality | Laser precision@1% | SceneNN precision@10cm | Laser query (ms) | SceneNN query (ms) |
|---|---|---|---|---|---|
| CGF-32 | 32 | 41.4 % | 50.6 % | 0.42 | 0.10 |
| Spin Images | 153 | 32.2 % | 8.2 % | 1.62 | 0.25 |
| FPFH | 33 | 28.1 % | 20.7 % | 0.04 | 0.02 |
| PFH | 125 | 24.5 % | 22.1 % | - | - |
| RoPS | 135 | 23.0 % | 22.7 % | - | - |
| SHOT | 352 | 22.5 % | 20.2 % | - | - |
| USC | 1980 | 21.7 % | 29.8 % | 31.6 | 6.75 |
On the Redwood registration benchmark (no fine-tuning):
| Method | Recall (%) | Precision (%) |
|---|---|---|
| FGR + FPFH | 51.1 | 23.2 |
| CZK + FPFH | 59.2 | 19.6 |
| 3DMatch (volumetric) | 65.1 | 25.2 |
| FGR + CGF-32 | 60.7 | 9.4 |
| CZK + CGF-32 | 72.0 | 14.6 |
CGF achieves higher precision at substantially lower dimensionality and query time than existing descriptors.
6. Implementation and Practicalities
- Histogram resolution: n_r (radial) × n_e (elevation) × n_a (azimuth) bins, which fixes the input dimensionality B
- Sphere radius r: approximately 17% of the model diameter for laser data, or 1.2 m for SceneNN; the inner radial threshold r_min is about 1.5% of the diameter or 0.1 m, respectively
- Local reference frame: estimated on a smaller support, about 2% of the model diameter or 0.25 m
- Feature size (): typically 12–64, with 32 optimal for precision-compactness trade-off
- Embedding network: five hidden layers of 512 ReLU units plus a linear output; total inference cost is six matrix multiplications and five ReLU applications
- Correspondence search uses k-d trees (FLANN) over the d-dimensional features, giving fast approximate nearest-neighbor queries per fragment (see the query times reported above)
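The query stage might look like the following sketch, using SciPy's cKDTree as a stand-in for FLANN; the descriptor arrays are synthetic placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
db = rng.normal(size=(1000, 32))              # CGF-32 descriptors of one fragment
queries = db[:5] + 1e-3 * rng.normal(size=(5, 32))  # slightly perturbed copies

tree = cKDTree(db)                            # built once per fragment
dist, idx = tree.query(queries, k=1)          # nearest neighbor per query descriptor
```

Because the query descriptors are tiny perturbations of the first five database entries, each query recovers its own source index. Low feature dimensionality (d = 32 rather than hundreds or thousands) is precisely what keeps such tree-based queries fast.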
7. Limitations and Future Directions
The CGF paradigm requires training on representative overlapping scans with known alignments. Highly regular or repetitive local geometry can present failure cases due to ambiguity in the learned embedding. While performance is robust to common occlusion and noise, extreme or adversarial aliasing remains challenging. Possible future work includes exploring alternative deep metric learning objectives (lifted-structure loss, N-pair loss) or architectural variants such as residual MLPs or attention mechanisms, with the aims of further improving discriminative power and compactness of the learned feature space (Khoury et al., 2017).