CALM-Net: Curvature-Aware Vehicle Re-ID
- CALM-Net is a curvature-aware multi-branch neural network that processes LiDAR point clouds for robust vehicle re-identification.
- It integrates edge convolution, point attention, and curvature embedding to extract complementary geometric, contextual, and surface variation features.
- Empirical evaluation on nuScenes data shows a roughly 1.97 percentage-point improvement in mean re-identification accuracy over strong baselines, with inference times compatible with real-time autonomous applications.
CALM-Net refers to a curvature-aware LiDAR point cloud-based multi-branch neural network designed for vehicle re-identification in three-dimensional point cloud data. It integrates complementary geometric, contextual, and surface variation features through specialized architectural components—edge convolution, point attention, and curvature embedding—to enhance the discriminative power of deep representations for distinguishing vehicles in large-scale datasets such as nuScenes. Empirical studies demonstrate that CALM-Net achieves a roughly 1.97 percentage point improvement in mean re-identification accuracy over strong baseline architectures. The design highlights the value of explicitly encoding local surface curvature information in point cloud models for robust vehicle identity matching across varying views and sparsity regimes (Lee et al., 16 Oct 2025).
1. Multi-Branch Architecture for Point Cloud Representation
CALM-Net adopts a multi-branch architecture that extracts and aggregates distinct but complementary features from raw LiDAR point clouds:
- Edge Convolution (EC) Branch: Models local geometric context. For each point $x_i$, its $k$ nearest neighbors are identified, an edge feature is computed for each point-neighbor pair using learned weights, and the edge features are aggregated by max pooling over the neighborhood. This stream is sensitive to local topology and micro-structural differences.
- Point Attention (PA) Branch: Implements global contextual reasoning in the spirit of attention mechanisms found in Vision Transformers. Input features are linearly projected into queries ($Q$), keys ($K$), and values ($V$); attention weights are computed from the queries and keys and then applied to the values to produce the contextualized output. This branch enables modeling of long-range dependencies within the point cloud.
- Curvature Embedding Branch: Quantifies and encodes local surface variation. For each point, the covariance matrix of its $k$-nearest neighborhood is computed. Eigendecomposition yields the eigenvalues $\lambda_1, \lambda_2, \lambda_3$, which encapsulate local patch geometry, and an embedding module maps them to a learned feature vector.
After computing features from each stream, the respective representations are concatenated and passed through subsequent convolutional and batch normalization layers, with a final ReLU producing the output:
$\mathrm{CALM\mbox{-}Net}(X) = \mathrm{ReLU}(B_2(X))$
where $B_2(X)$ is the output of the last of those layers applied to the concatenated branch features.
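The three branches and their concatenation can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the edge feature uses a DGCNN-style form, the attention is plain scaled dot-product attention over raw coordinates, the curvature embedding is a single linear map with ReLU, all weights are random, and the trailing convolution and batch normalization layers are omitted. The function names (`knn_indices`, `edge_conv`, `point_attention`, `curvature_eigenvalues`, `calm_like_features`) and the parameters `k` and `channels` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (self excluded)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def edge_conv(points, idx, theta, phi):
    """Assumed DGCNN-style edge features, max-pooled over each neighborhood."""
    neighbors = points[idx]                                  # (N, k, 3)
    centers = points[:, None, :]                             # (N, 1, 3)
    edges = (neighbors - centers) @ theta + centers @ phi    # (N, k, C)
    return np.maximum(edges, 0.0).max(axis=1)                # (N, C)

def point_attention(feats, w_q, w_k, w_v):
    """Scaled dot-product self-attention over all points."""
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)              # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)                  # row-wise softmax
    return attn @ v                                          # (N, C)

def curvature_eigenvalues(points, idx):
    """Covariance eigenvalues of each neighborhood, in descending order."""
    neighbors = points[idx]                                  # (N, k, 3)
    centered = neighbors - neighbors.mean(axis=1, keepdims=True)
    cov = np.einsum("nki,nkj->nij", centered, centered) / idx.shape[1]
    return np.linalg.eigvalsh(cov)[:, ::-1]                  # (N, 3)

def calm_like_features(points, k=8, channels=16):
    """Concatenate the three branch outputs (random, untrained weights)."""
    idx = knn_indices(points, k)
    theta, phi = rng.normal(size=(2, 3, channels))
    w_q, w_k, w_v = rng.normal(size=(3, 3, channels))
    w_c = rng.normal(size=(3, channels))
    ec = edge_conv(points, idx, theta, phi)
    pa = point_attention(points, w_q, w_k, w_v)
    ce = np.maximum(curvature_eigenvalues(points, idx) @ w_c, 0.0)
    return np.concatenate([ec, pa, ce], axis=1)

points = rng.normal(size=(64, 3))
features = calm_like_features(points)
print(features.shape)  # (64, 48)
```

Each branch returns one feature row per point, so concatenation along the channel axis gives a per-point descriptor that later layers can mix.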
2. Role and Implementation of Curvature Embedding
Curvature embedding is central to CALM-Net’s discriminative capacity. By moving beyond raw (x, y, z) coordinate processing, CALM-Net leverages the principal eigenvalues of neighborhood covariances to encode deviations from local planarity:
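- Flat surfaces yield two significant eigenvalues and one near-zero eigenvalue, because the points spread within the plane but barely along its normal.
- Edges or ridges, where two surfaces meet, raise the smallest eigenvalue above zero while it remains clearly smaller than the other two.
- Highly curved regions have three eigenvalues of comparable magnitude.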
This spectral encoding via eigenvalues is invariant to rotations of the point cloud, and hence to viewpoint changes, and provides robustness to sparsity, allowing the network to distinguish vehicles with subtle geometric cues. The encoded curvature vector is mapped non-linearly to a learned feature space, yielding gains in re-identification accuracy, especially among classes with similar gross shape but varying micro-structure.
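The eigenvalue patterns and the rotation invariance can be checked numerically. The sketch below is illustrative only: the synthetic patches (a noisy plane and points on a unit sphere) and the helper names `cov_eigenvalues` and `random_rotation` are assumptions made for the example, not part of CALM-Net.

```python
import numpy as np

rng = np.random.default_rng(1)

def cov_eigenvalues(patch):
    """Eigenvalues of the patch covariance matrix, in descending order."""
    centered = patch - patch.mean(axis=0)
    cov = centered.T @ centered / len(patch)
    return np.linalg.eigvalsh(cov)[::-1]

def random_rotation(rng):
    """Random proper rotation matrix obtained from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]          # flip one axis so that det(q) = +1
    return q

# Flat patch: points in the z = 0 plane with very small out-of-plane noise.
flat = np.column_stack([
    rng.uniform(-1, 1, 200),
    rng.uniform(-1, 1, 200),
    1e-3 * rng.normal(size=200),
])

# Strongly curved patch: points spread over a unit sphere.
sphere = rng.normal(size=(200, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)

lam_flat = cov_eigenvalues(flat)
lam_sphere = cov_eigenvalues(sphere)
lam_rotated = cov_eigenvalues(flat @ random_rotation(rng).T)

print("flat   :", lam_flat)      # two comparable values, one near zero
print("sphere :", lam_sphere)    # three comparable values
print("rotation invariant:", np.allclose(lam_flat, lam_rotated))
```

Rotating a patch rotates the eigenvectors of its covariance matrix but leaves the eigenvalues unchanged, which is why the eigenvalues, rather than the eigenvectors, make a viewpoint-independent descriptor.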
3. Experimental Evaluation and Quantitative Results
CALM-Net was benchmarked on a nuScenes-derived vehicle re-identification dataset:
- Only annotated frames with at least 127 points each were considered.
- Both rigid (e.g., car, truck, bus, trailer) and deformable (e.g., motorcycle, pedestrian) object classes were evaluated using a pairwise matching protocol and metrics such as mean accuracy (mAcc), F1 positive, and F1 negative scores.
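For a pairwise matching protocol, each pair of observations carries a binary label (same object or not), and the positive and negative F1 scores treat each of the two labels in turn as the target class. The sketch below shows one way to compute them; the function name `pairwise_metrics` and the toy labels are hypothetical, and how the source averages accuracy into mAcc (for example per class) is not specified here, so only plain accuracy is computed.

```python
import numpy as np

def pairwise_metrics(y_true, y_pred):
    """Accuracy, F1 for the positive class, and F1 for the negative class."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))      # same object, predicted same
    tn = int(np.sum(~y_true & ~y_pred))    # different object, predicted different
    fp = int(np.sum(~y_true & y_pred))     # different object, predicted same
    fn = int(np.sum(y_true & ~y_pred))     # same object, predicted different

    def f1(hits, false_alarms, misses):
        denom = 2 * hits + false_alarms + misses
        return 2 * hits / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(y_true),
        "f1_pos": f1(tp, fp, fn),
        "f1_neg": f1(tn, fn, fp),          # roles of the two classes swapped
    }

# Toy example: 3 matching pairs and 5 non-matching pairs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = pairwise_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.75, f1_pos about 0.667, f1_neg 0.8
```

Reporting both F1 scores matters when matching and non-matching pairs are imbalanced, since a model can score well on the majority label while doing poorly on the other.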
Key findings include:
| Method | Mean Acc. | F1 Pos. | F1 Neg. | Inference Time (256 pts, ms) |
|---|---|---|---|---|
| PointNet | 91.54 | 90.64 | 97.62 | 20–21 |
| PointNeXt | 94.91 | 94.12 | 98.00 | 27–29 |
| DGCNN | 92.41 | 91.18 | 97.41 | 58–59 |
| DeepGCN | 93.67 | 93.02 | 97.81 | 52–55 |
| Point Transformer | 94.16 | 93.49 | 98.65 | 29–32 |
| CALM-Net | 95.74 | 95.28 | 98.89 | 23–24 |
- Hybrid point subsampling (random during training, FPS at inference) was used for best accuracy.
- Rigid objects benefitted most from curvature embedding; performance on deformable classes remained lower.
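The hybrid subsampling scheme can be sketched as two interchangeable samplers: cheap random selection, which adds variety during training, and farthest point sampling (FPS), which gives deterministic, well-spread coverage at inference. This is a minimal sketch assuming the cloud has at least as many points as requested; how clouds with fewer than 256 points are handled is not specified in the text, and the function names are hypothetical.

```python
import numpy as np

def random_subsample(points, m, rng):
    """Pick m distinct points uniformly at random (training-time sampler)."""
    idx = rng.choice(len(points), size=m, replace=False)
    return points[idx]

def farthest_point_sampling(points, m, start=0):
    """Greedy FPS: repeatedly add the point farthest from those already chosen."""
    chosen = np.empty(m, dtype=int)
    chosen[0] = start
    # dist[i] holds the distance from point i to its nearest chosen point.
    dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, m):
        chosen[i] = int(np.argmax(dist))
        new_dist = np.linalg.norm(points - points[chosen[i]], axis=1)
        dist = np.minimum(dist, new_dist)
    return points[chosen]

rng = np.random.default_rng(2)
cloud = rng.normal(size=(1000, 3))

train_sample = random_subsample(cloud, 256, rng)     # differs between calls
infer_sample = farthest_point_sampling(cloud, 256)   # same result every call
print(train_sample.shape, infer_sample.shape)        # (256, 3) (256, 3)
```

The greedy loop costs O(N·m) distance evaluations, which is small for clouds of this size.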
Ablation studies confirmed that each architectural branch—EC, PA, and curvature embedding—contributed distinctly, with their combination yielding the highest accuracy.
4. Mathematical Formulation Details
The key mathematical operations of CALM-Net include:
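- Covariance eigenanalysis for curvature: the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of the covariance matrix of each point's $k$-nearest neighborhood.
- Edge convolution: learned edge features over each point's $k$ nearest neighbors, aggregated by max pooling.
- Point attention: query, key, and value projections followed by attention-weighted aggregation of the values.
- Aggregation: concatenation of the branch outputs, followed by convolution and batch normalization, with the final output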
$\mathrm{CALM\mbox{-}Net}(X) = \mathrm{ReLU}(B_2(X))$
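One standard way to write the first three operations is sketched below. It assumes a DGCNN-style edge function and scaled dot-product attention; the symbols $\theta$, $\phi$, $W_Q$, $W_K$, $W_V$, $d$, $\mathcal{N}_k(i)$, and $\bar{x}_i$ are assumptions introduced for illustration and may differ from the authors' notation.

Covariance eigenanalysis over the $k$-nearest neighborhood $\mathcal{N}_k(i)$ of point $x_i$:

$$\bar{x}_i = \frac{1}{k} \sum_{j \in \mathcal{N}_k(i)} x_j, \qquad C_i = \frac{1}{k} \sum_{j \in \mathcal{N}_k(i)} (x_j - \bar{x}_i)(x_j - \bar{x}_i)^\top$$

$$C_i v_m = \lambda_m v_m, \qquad \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge 0$$

Edge convolution with learned weights $\theta$ and $\phi$, followed by max pooling:

$$e_{ij} = \mathrm{ReLU}\big(\theta\,(x_j - x_i) + \phi\,x_i\big), \qquad x_i' = \max_{j \in \mathcal{N}_k(i)} e_{ij}$$

Point attention over input features $X$ with projection dimension $d$:

$$Q = X W_Q, \quad K = X W_K, \quad V = X W_V, \qquad A = \mathrm{softmax}\!\left(\frac{Q K^\top}{\sqrt{d}}\right), \qquad Y = A V$$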
5. Application Prospects and Implications
The design and empirical efficacy of CALM-Net indicate several directions for application and further research:
- Real-time Automotive Systems: CALM-Net operates at 23–24 ms/frame (256 points), enabling deployment in latency-sensitive autonomous driving and intelligent surveillance.
- Robust Multi-object Tracking: The integrated features support reliable association of vehicles under changing viewpoints, partial occlusions, and variable LiDAR returns, thus enhancing multi-camera/sensor tracking frameworks.
- 3D Geometric Reasoning: The explicit curvature branch provides a template for future 3D models requiring local surface analysis, with potential extensions for non-rigid/deformable object reasoning or fusion with camera/radar modalities.
- Improving Re-identification for Deformable Classes: Results suggest the need for specialized adaptations to achieve similar gains for motorcycles, bicycles, and pedestrians.
6. Comparison with Baseline Methods
| Model | Curvature Embedding | mAcc (%) | Reported Gain (pp) |
|---|---|---|---|
| PointNet | ✗ | 91.54 | – |
| PointNeXt | ✗ | 94.91 | – |
| DGCNN | ✗ | 92.41 | – |
| DeepGCN | ✗ | 93.67 | – |
| Point Transformer | ✗ | 94.16 | – |
| CALM-Net | ✓ | 95.74 | +1.97 |
These quantitative comparisons indicate that CALM-Net's combination of multi-branch feature learning and explicit curvature encoding extracts discriminative and robust features that the compared baselines capture less effectively.
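The per-baseline margins implied by the table can be computed directly from the mAcc column. The short sketch below uses only the values listed above; the variable names are hypothetical.

```python
# Mean accuracy (%) copied from the comparison table.
macc = {
    "PointNet": 91.54,
    "PointNeXt": 94.91,
    "DGCNN": 92.41,
    "DeepGCN": 93.67,
    "Point Transformer": 94.16,
    "CALM-Net": 95.74,
}

calm = macc.pop("CALM-Net")
# Margin of CALM-Net over each baseline, in percentage points.
margins = {name: round(calm - value, 2) for name, value in macc.items()}

for name, gap in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name:18s} {gap:+.2f}")
```

Computed this way, the margins range from +0.83 (PointNeXt) to +4.20 (PointNet), and none of them equals 1.97, so the reported +1.97 presumably refers to a comparison or setting that the table does not reproduce.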
7. Future Directions
Extensions and open research avenues include:
- Refining the curvature embedding for better expressivity, possibly leveraging higher-order local statistics.
- Addressing disparities in performance for deformable versus rigid classes by integrating multi-scale encoding or adaptive modules.
- Exploring multimodal fusion (e.g., with RGB or radar) using the CALM-Net framework for unified scene understanding.
- Systematic exploration of architectural trade-offs between computational complexity and representational power for larger-scale deployment.
CALM-Net exemplifies the trend toward explicit geometric encoding merged with attention-based contextual processing in 3D vision, offering a robust foundation for next-generation vehicle re-identification and tracking in autonomous systems (Lee et al., 16 Oct 2025).