Point-Centric Convolution: 3D Feature Aggregation

Updated 5 December 2025
  • Point-Centric Convolution is a filtering operator focused on the direct, translation-invariant aggregation of point cloud data using local neighborhoods.
  • It employs learnable weight functions and geometric priors—via methods like KPConv, FPConv, and FKAConv—to manage irregular, discrete spatial data for precise 3D segmentation and classification.
  • Recent approaches integrate patch-based techniques and local reference frames to achieve rotation and scale invariance while enhancing computational efficiency.

A point-centric convolution is a convolution operator whose support and filtering are explicitly centered around the locations of input points in a point cloud, rather than on a gridded or quantized spatial substrate. This design principle addresses the irregular, discrete nature of point cloud data and enables direct, translation-invariant local aggregation, precise feature encoding, and geometric generalization. Point-centric convolution underpins several state-of-the-art architectures for 3D classification and segmentation, and manifests in a range of algorithmic implementations that exploit spatial localization, kernel flexibility, geometric priors, and continuous formulation.

1. Mathematical Foundations of Point-Centric Convolution

The canonical mathematical form for point-centric convolution can be expressed as an aggregation over a local neighborhood $\mathcal N(p)$ around each reference (center) point $p$:

$$x'_p = \sum_{p_j \in \mathcal N(p)} w(p_j - p)^T x(p_j)$$

where $x(p_j)$ is the input feature at neighbor $p_j$, and $w(\Delta p)$ is a weight-generating function parameterized either by learnable parameters (e.g., MLPs, kernel points, shape priors) or by geometric construction (Wu et al., 2022, Thomas et al., 2019, Huang et al., 2020).
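As a concrete illustration, the following is a minimal PyTorch sketch of this operator with an MLP-based weight function $w$. The class name, layer sizes, and shapes are illustrative, not taken from any of the cited papers.

```python
# Minimal sketch of a point-centric convolution with an MLP weight
# function w(Δp), following the equation above. Illustrative only.
import torch
import torch.nn as nn

class PointCentricConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, hidden: int = 32):
        super().__init__()
        # w: R^3 -> R^{in_ch * out_ch}, a learned weight-generating MLP
        self.weight_mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, in_ch * out_ch),
        )
        self.in_ch, self.out_ch = in_ch, out_ch

    def forward(self, p, neighbors, feats):
        # p:         (3,)      center point
        # neighbors: (K, 3)    coordinates of the K points in N(p)
        # feats:     (K, C_in) input features x(p_j)
        dp = neighbors - p                       # relative coordinates Δp
        W = self.weight_mlp(dp)                  # (K, C_in * C_out)
        W = W.view(-1, self.in_ch, self.out_ch)  # (K, C_in, C_out)
        # sum_j  w(p_j - p)^T x(p_j)
        return torch.einsum('kc,kco->o', feats, W)   # (C_out,)
```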

Point-centric convolution generalizes traditional grid-based convolution by making the filter a function of relative coordinates $\Delta p$. This property ensures translation invariance and, depending on the kernel design, may be extended to rotation and scale invariance (Zhang et al., 2020, Huang et al., 2020, Jin et al., 2019). The kernel itself may be:

  • a set of explicit, possibly deformable kernel points (KPConv);
  • a compact geometric prior such as a point, line, plane, or sphere (HPC);
  • a learned dictionary of anisotropic spatial directions (STPC);
  • a weight-generating function detached from explicit spatial locations (FKAConv).
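Translation invariance can be verified numerically on the sketch above: shifting the center and all neighbors by the same offset leaves the output unchanged, because only $\Delta p$ enters the weight function.

```python
# Quick check of translation invariance for the sketch above: a common
# shift t of all coordinates does not change the output.
conv = PointCentricConv(in_ch=4, out_ch=8)
p = torch.randn(3); nbrs = torch.randn(16, 3); x = torch.randn(16, 4)
t = torch.tensor([5.0, -2.0, 0.3])
out1 = conv(p, nbrs, x)
out2 = conv(p + t, nbrs + t, x)
assert torch.allclose(out1, out2, atol=1e-5)
```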

2. Kernel Construction and Geometric Priors

Several families of point-centric convolution operators have emerged:

| Operator | Kernel Representation | Spatial Localization |
|---|---|---|
| KPConv | Explicit kernel points $\{p_k\}$ | Euclidean, deformable |
| FPConv | Soft projection to 2D grid, weight map | Flattened surface |
| HPC | Geometric priors: point, line, plane, sphere | Hausdorff shape match |
| FKAConv | Geometry-less kernel weights, alignment | Feature alignment MLP |
| GCAConv | Anchors via global context, local frame | Rotation-invariant bins |
| STPC | Direction dictionary, anisotropic slots | Learned directions |
| PointCNN++ | Native point-centered bins (local voxels) | Sparse local quantization |

KPConv (Thomas et al., 2019) places learnable kernel points in local neighborhoods, with optional deformability to adapt to intrinsic local geometry. Hausdorff Point Convolution (HPC) (Huang et al., 2020) replaces spatial kernels with compact geometric priors (e.g., sphere, plane) and computes shape-aware responses. FPConv (Lin et al., 2020) uses local flattening to enable 2D CNNs on local patches. FKAConv (Boulch et al., 2020) detaches kernel weights from explicit spatial locations, focusing on soft assignment and alignment. GCAConv (Zhang et al., 2020) builds a local reference frame using global statistics, achieving rotation-invariant filtering. STPC (Fang et al., 2020) learns a dictionary of latent spatial directions, enabling fully anisotropic responses across unconstrained 3D neighborhoods.
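The rigid KPConv rule admits a compact sketch: each kernel point carries its own weight matrix, and a neighbor contributes to that matrix in proportion to a linear correlation that decays with its distance to the kernel point (Thomas et al., 2019). Kernel-point placement, $\sigma$, and shapes below are illustrative.

```python
# Sketch of rigid KPConv-style aggregation: per-kernel-point weight
# matrices combined with a linear distance correlation. Illustrative.
import torch

def kpconv_rigid(dp, feats, kernel_pts, W, sigma=0.3):
    # dp:         (K, 3)            relative neighbor coords p_j - p
    # feats:      (K, C_in)         neighbor features
    # kernel_pts: (M, 3)            learnable kernel point positions
    # W:          (M, C_in, C_out)  one weight matrix per kernel point
    dist = torch.cdist(dp, kernel_pts)            # (K, M)
    h = torch.clamp(1.0 - dist / sigma, min=0.0)  # linear correlation
    # weighted sum over neighbors k and kernel points m
    return torch.einsum('km,kc,mco->o', h, feats, W)
```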

3. Neighborhood Definition and Permutation Invariance

Neighborhood formation is central to point-centric convolution. Common strategies include:

  • fixed-radius (ball) queries, which bound the spatial extent of the kernel support;
  • k-nearest-neighbor (kNN) search, which fixes the cardinality of $\mathcal N(p)$;
  • hierarchical subsampling (e.g., farthest-point or grid sampling) to select query points at successive scales.

Both query types are sketched below.
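A minimal sketch of the two standard queries, using SciPy's KD-tree (point counts and radii are arbitrary placeholders):

```python
# Fixed-radius ball query and kNN search over an (N, 3) point cloud.
import numpy as np
from scipy.spatial import cKDTree

pts = np.random.rand(10000, 3).astype(np.float32)
tree = cKDTree(pts)
query = pts[0]

ball_idx = tree.query_ball_point(query, r=0.1)   # indices within radius 0.1
_, knn_idx = tree.query(query, k=16)             # 16 nearest neighbors
```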

Permutation invariance is typically achieved via symmetric aggregation functions (max-pooling, summation), kernel designs with no dependence on input order, basis expansion (the extension–restriction scheme of PCNN (Atzmon et al., 2018)), or frame consistency (NPTC-net (Jin et al., 2019)).
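Continuing the earlier sketch, a quick numerical check that summation over the neighborhood erases neighbor ordering:

```python
# Permuting neighbors permutes dp and feats consistently; the sum over
# the neighborhood makes the output order-independent.
perm = torch.randperm(16)
out_a = conv(p, nbrs, x)
out_b = conv(p, nbrs[perm], x[perm])
assert torch.allclose(out_a, out_b, atol=1e-5)
```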

4. Computational Strategies and Operator Efficiency

Emerging point-centric convolutions increasingly prioritize computational efficiency and scalability:

  • PointCNN++ (Li et al., 28 Nov 2025) introduces a highly optimized Matrix-Vector Multiplication and Reduction (MVMR) primitive, enabling convolution over native points with minimal memory and runtime overhead.
  • SPConv (Li et al., 2021) uses hierarchical shell-based aggregation fused by 1D convolutions across shells, combined with Poisson Disk downsampling for efficiency.
  • FPConv (Lin et al., 2020) leverages learned local flattening and optimized 2D convolution for high-throughput surface analysis.
  • FKAConv (Boulch et al., 2020) employs a quasi-uniform spatial quantization for rapid subsampling, outperforming standard farthest-point sampling in speed while maintaining coverage; a minimal sketch of this kind of grid subsampling follows the list.
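The grid-quantization idea can be sketched as follows: quantize coordinates to a voxel grid and keep one representative point per occupied cell. This is a hypothetical minimal version; the published FKAConv implementation may differ in detail.

```python
# Quasi-uniform grid subsampling: one point per occupied voxel.
import numpy as np

def grid_subsample(pts: np.ndarray, cell: float) -> np.ndarray:
    # pts: (N, 3) -> (M, 3), one representative point per occupied cell
    keys = np.floor(pts / cell).astype(np.int64)           # voxel indices
    _, first = np.unique(keys, axis=0, return_index=True)  # one per cell
    return pts[np.sort(first)]

pts = np.random.rand(100000, 3).astype(np.float32)
sub = grid_subsample(pts, cell=0.05)   # roughly uniform spatial coverage
```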

End-to-end architectures commonly follow encoder–decoder (U-Net) or hierarchical residual block patterns, slotting point-centric convolution as the core local operator, sometimes interleaved with attention, feature propagation, and anisotropic filtering (Fang et al., 2020, Li et al., 2021, Lin et al., 2020, Li et al., 28 Nov 2025).

5. Geometric Invariance and Shape Awareness

Geometric invariance is a defining attribute of point-centric convolution:

  • Translation invariance follows directly from parameterizing the filter on relative coordinates $\Delta p$.
  • Rotation invariance can be obtained through local reference frames, as in GCAConv's globally anchored frames or NPTC-net's frame consistency.
  • Scale invariance, where supported, depends on the kernel design (Zhang et al., 2020, Huang et al., 2020, Jin et al., 2019).

Shape-awareness, as in HPC, is introduced by aggregating shortest distances between query and kernel sets, enabling enhanced semantic discrimination of planar, linear, or volumetric regions. FPConv exhibits specialization for flat surface patches, while KPConv (particularly deformable) adapts spatial kernels to complex local curvatures (Lin et al., 2020, Thomas et al., 2019).
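The shortest-distance matching step can be illustrated as follows, assuming a plane prior sampled as a small point grid. This uses an averaged (Chamfer-like) variant of Hausdorff matching for brevity; the exact HPC formulation aggregates such distances per kernel element and differs in detail.

```python
# Shape-aware response in the spirit of HPC: compare a neighborhood
# against a geometric prior via shortest distances in both directions.
import torch

def directed_shortest(a, b):
    # mean over points of a of the distance to their closest point in b
    return torch.cdist(a, b).min(dim=1).values.mean()

def shape_response(neighborhood, prior):
    # symmetric, averaged Hausdorff-style distance; lower = better match
    return 0.5 * (directed_shortest(neighborhood, prior)
                  + directed_shortest(prior, neighborhood))

# prior: a plane sampled as a 5x5 point grid on z = 0 (hypothetical choice)
g = torch.linspace(-1.0, 1.0, 5)
plane = torch.cat([torch.cartesian_prod(g, g), torch.zeros(25, 1)], dim=1)

# a near-planar neighborhood should score a small distance to the prior
nbrs = torch.randn(32, 3) * torch.tensor([1.0, 1.0, 0.05])
print(shape_response(nbrs, plane))
```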

6. Empirical Performance and Task-Specific Adaptation

Point-centric convolution operators have achieved state-of-the-art results across major benchmarks:

| Method | ModelNet40 (OA) | S3DIS (mIoU) | SemanticKITTI (mIoU) |
|---|---|---|---|
| KPConv (rigid) (Thomas et al., 2019) | 92.9% | 65.4% | 58.8% |
| FPConv (Lin et al., 2020) | 92.5% | 62.8% | – |
| HPC-DNN (multi-kernel) (Huang et al., 2020) | – | 68.2% | 60.3% |
| SPNet (Li et al., 2021) | – | 69.9% | – |
| FKAConv (Boulch et al., 2020) | 92.5% | 68.4% | 74.6% |
| PointCNN++ (Li et al., 28 Nov 2025) | – | – | 99.8% (registration recall, KITTI) |
| PointConvFormer (Wu et al., 2022) | – | 74.5% (ScanNet) | 67.1% |

Task adaptation is evident in the use of multi-kernel HPC for hierarchical encoding, fusions of FPConv and KPConv for curvature-specific regions, and local attention (SPNet, PointConvFormer) for fine-grained neighbor selection (Li et al., 2021, Wu et al., 2022). Synthesis of anisotropic and shape-aware responses has led to increased segmentation and registration accuracy.

7. Future Directions and Challenges

Key avenues for future research include:

  • Data-driven or differentiable kernel search (PointSeaConv, PointSeaNet (Nie et al., 2021)), aiming for joint optimization of convolution operator and network topology.
  • Enhanced geometric invariance, potentially by integrating non-rigid or transformation-equivariant descriptors (GCAConv, NPTC-net).
  • Efficient adaptation to large-scale, real-world point clouds with noise, partiality, and multi-modal attributes.
  • Further fusion of classic convolution (grid-based) and point-centric paradigms to balance geometric fidelity and throughput (PointCNN++) (Li et al., 28 Nov 2025).
  • Dynamic kernel generation, learned anchor placement, and integrated attention mechanisms for context-aware local aggregation.
