Relative Energy Learning in 3D LiDAR OOD
- REL is a framework for 3D LiDAR OOD detection that uses energy-based modeling with a shift-invariant scoring mechanism to differentiate between inliers and anomalous points.
- It integrates closed-set segmentation with a binary logistic loss on the relative energy gap, enhancing robustness in safety-critical autonomous driving systems.
- The approach incorporates a geometry-aware synthetic OOD generation method (Point Raise) to augment training data and improve detection metrics on benchmarks.
Relative Energy Learning (REL) is a framework for out-of-distribution (OOD) detection in 3D LiDAR point clouds, designed for safety-critical autonomous driving environments where reliable identification of rare or anomalous objects is essential. Unlike prior approaches that adapt 2D image OOD methods to 3D data with limited success, REL introduces a shift-invariant energy scoring mechanism and a tailored synthetic OOD data strategy, yielding robust discrimination between inlier and anomalous points.
1. Energy-Based Modeling for LiDAR OOD Detection
REL builds on energy-based models, which assess sample “confidence” by computing energy scores from neural logits. Formally, given a segmentation network with logits $f_k(x)$ for the $K$ in-distribution classes, the energy score is

$$E(x) = -T \log \sum_{k=1}^{K} \exp\!\big(f_k(x)/T\big),$$

where $T$ is a temperature hyperparameter. OOD samples typically yield higher energies than inliers. Prior methods employ hinge losses with preset margins and temperature scaling, but calibration becomes tenuous under severe class imbalance and varying logit scales common in LiDAR tasks.
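For reference, a minimal PyTorch sketch of this energy score (tensor shapes are illustrative assumptions):

```python
import torch

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    """Free energy E(x) = -T * logsumexp(f(x) / T) over the K inlier logits.

    logits: (N, K) per-point logits for the in-distribution classes.
    Returns an (N,) tensor; OOD points tend to receive higher energies.
    """
    return -T * torch.logsumexp(logits / T, dim=-1)
```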
2. Relative Energy: Motivation, Definition, and Invariance
Relative Energy ameliorates the shift- and scale-sensitivity of raw energy scores by comparing the summed exponentials of positive (“inlier”) logits and negative (learned “OOD”) logits. The network is reparameterized to output $2K$ logits per point, with the first $K$ for the in-distribution classes and the second $K$ for the negative/OOD branch:
- Positive free energy: $E^{+}(x) = -\log \sum_{k=1}^{K} \exp\big(f_k^{+}(x)\big)$
- Negative free energy: $E^{-}(x) = -\log \sum_{k=1}^{K} \exp\big(f_k^{-}(x)\big)$
- Relative energy gap: $\Delta E(x) = E^{+}(x) - E^{-}(x) = -\log \dfrac{\sum_{k}\exp\big(f_k^{+}(x)\big)}{\sum_{k}\exp\big(f_k^{-}(x)\big)}$
Because the gap is a log-ratio of summed exponentials, it is invariant to a common shift of the logits, reducing the need for per-scene or per-backbone calibration, and it reliably separates the energy distributions of inliers and OOD points even when classes are imbalanced ($\Delta E(x) < 0$ for inliers; $\Delta E(x) > 0$ for OOD).
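A minimal sketch of the gap computation (layout of the $2K$ logits assumed as described above):

```python
import torch

def relative_energy_gap(logits: torch.Tensor, K: int) -> torch.Tensor:
    """Delta E(x) = E+(x) - E-(x) from a 2K-logit head.

    logits: (N, 2K) per-point logits; columns [0, K) are positive (inlier)
    logits, columns [K, 2K) are negative (learned OOD) logits.
    Returns an (N,) gap: negative for inliers, positive for OOD points.
    """
    e_pos = -torch.logsumexp(logits[:, :K], dim=-1)  # positive free energy
    e_neg = -torch.logsumexp(logits[:, K:], dim=-1)  # negative free energy
    return e_pos - e_neg  # a constant shift of all logits cancels here
```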
3. Integrated Training Objective
REL’s training combines closed-set segmentation, via the Mask4Former objective, and a binary logistic loss on the relative energy gap. The main terms are:
- Mask/class loss ($\mathcal{L}_{\text{mask}}$): cross-entropy and Hungarian-matched mask overlap, as in Mask4Former.
- REL OOD loss ($\mathcal{L}_{\text{REL}}$): a binary logistic loss on the relative energy gap,

$$\mathcal{L}_{\text{REL}} = -\frac{1}{|\mathcal{I}|}\sum_{x \in \mathcal{I}} \log \sigma\!\big(-\Delta E(x)\big) \;-\; \frac{\alpha}{|\mathcal{O}|}\sum_{x \in \mathcal{O}} \log \sigma\!\big(\Delta E(x)\big),$$

where $\mathcal{I}$ is the set of inlier points, $\mathcal{O}$ is the set of synthetic OOD points from Point Raise, $\alpha$ is an imbalance weight (typically 100), and $\sigma$ is the sigmoid function. The combined objective is

$$\mathcal{L} = \mathcal{L}_{\text{mask}} + \lambda\,\mathcal{L}_{\text{REL}},$$

with $\lambda$ a fixed weight on the OOD term. This encourages the relative energy gap to separate the inlier and OOD populations.
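A compact sketch of this loss term (the exact normalization and weight placement are assumptions consistent with the description above):

```python
import torch
import torch.nn.functional as F

def rel_ood_loss(gap: torch.Tensor, is_ood: torch.Tensor, alpha: float = 100.0) -> torch.Tensor:
    """Binary logistic loss on the relative energy gap Delta E(x).

    gap:    (N,) relative energy gaps.
    is_ood: (N,) boolean mask, True for synthetic OOD points from Point Raise.
    alpha:  imbalance weight on the (much rarer) OOD term.
    """
    inlier_gap, ood_gap = gap[~is_ood], gap[is_ood]
    # -log sigmoid(-Delta E) = softplus(Delta E): push inlier gaps below zero.
    loss_in = F.softplus(inlier_gap).mean() if inlier_gap.numel() else gap.new_zeros(())
    # -log sigmoid(Delta E) = softplus(-Delta E): push OOD gaps above zero.
    loss_out = F.softplus(-ood_gap).mean() if ood_gap.numel() else gap.new_zeros(())
    return loss_in + alpha * loss_out
```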
4. Point Raise: Geometry-Aware Synthetic OOD Generation
The absence of annotated OOD points in training data is addressed by Point Raise, a lightweight geometry-aware synthesis algorithm. Its steps are:
- Select a random seed point on the road, using the semantic labels of the point cloud to identify road points.
- Gather the cluster of points within radius $r$ of the seed via a KD-tree query, and compute each cluster point’s spatial distance to the seed along with the cluster’s minimum/maximum distances.
- Apply an inward pull: compute an adaptive decay from each point’s normalized distance and scale its offset from the seed by that factor, contracting the cluster toward the seed.
- Assign random heights to the cluster points within a preset range, updating their $z$ coordinates.
- Relabel the affected points with an auxiliary OOD class (“RAISED_CLASS”).
The main hyperparameters are the cluster radius and the raise-height range (both in meters) together with the pull strength; with the recommended settings, Point Raise produces compact, object-like OOD clusters that do not overlap inlier semantics.
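A simplified sketch of such a procedure (the default radius, height range, and pull strength below are illustrative placeholders, not the paper’s recommended values):

```python
import numpy as np
from scipy.spatial import cKDTree

def point_raise(points: np.ndarray, labels: np.ndarray, road_id: int, raised_id: int,
                radius: float = 2.0, pull: float = 0.5, h_range=(0.5, 2.0), rng=None):
    """Geometry-aware synthetic OOD generation (sketch). Modifies arrays in place.

    points: (N, 3+) float array of LiDAR points, labels: (N,) semantic labels.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Pick a random road point as the cluster seed.
    road_idx = np.flatnonzero(labels == road_id)
    seed = points[rng.choice(road_idx), :3]

    # Cluster of points within `radius` of the seed (KD-tree query).
    idx = np.asarray(cKDTree(points[:, :3]).query_ball_point(seed, r=radius))

    # Inward pull: points farther from the seed are contracted more strongly.
    offsets = points[idx, :3] - seed
    dist = np.linalg.norm(offsets, axis=1, keepdims=True)
    decay = 1.0 - pull * dist / max(dist.max(), 1e-6)
    points[idx, :3] = seed + offsets * decay

    # Raise the cluster by a random height and relabel as the auxiliary OOD class.
    points[idx, 2] += rng.uniform(*h_range)
    labels[idx] = raised_id
    return points, labels
```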
5. Network Architecture and REL Integration
The backbone is Mask4Former-3D: a Minkowski Sparse-UNet encoder and a transformer decoder over FPS-sampled queries, producing panoptic masks for the $K$ inlier classes. REL’s OOD scoring is realized as:
- Auxiliary projector branch appended to every point’s encoder features
- Three Linear + ReLU layers yielding $2K$ logits per point
- First $K$: positive logits for the in-distribution classes
- Next $K$: negative logits for OOD
- Relative energy gap computed per point and used for OOD decision
During inference, segmentation proceeds through Mask4Former as usual; REL assigns each point an OOD score via its relative energy gap $\Delta E(x)$.
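A rough PyTorch sketch of this auxiliary branch (the hidden width and exact layer arrangement are assumptions):

```python
import torch
import torch.nn as nn

class RELHead(nn.Module):
    """Auxiliary projector mapping per-point encoder features to 2K logits (sketch)."""

    def __init__(self, feat_dim: int, num_classes: int, hidden: int = 256):
        super().__init__()
        self.num_classes = num_classes
        self.proj = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * num_classes),  # K positive + K negative logits
        )

    def forward(self, point_feats: torch.Tensor) -> torch.Tensor:
        """point_feats: (N, feat_dim) -> per-point OOD score Delta E(x), shape (N,)."""
        logits = self.proj(point_feats)
        e_pos = -torch.logsumexp(logits[:, : self.num_classes], dim=-1)
        e_neg = -torch.logsumexp(logits[:, self.num_classes :], dim=-1)
        return e_pos - e_neg
```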
6. Empirical Performance and Ablation Results
REL has been evaluated on the STU (Spotting the Unexpected) and SemanticKITTI benchmarks using both point- and object-level OOD metrics, against baselines including Deep Ensemble, MC Dropout, Max Logit (MSP), Void Classifier, RbA, and UEM. Results include:
| Dataset (method) | AUROC (↑) | FPR@95% (↓) | AP (↑) |
|---|---|---|---|
| STU val (REL) | 97.85 | 9.60 | 10.68 |
| STU val (UEM) | 95.80 | 26.37 | 6.78 |
| STU test (REL) | 96.26 | 21.69 | 10.17 |
| STU test (Void) | 85.99 | 78.60 | 3.92 |
| KITTI outlier (REL) | 96.76 | 18.66 | 67.32 |
| KITTI outlier (UEM) | 93.15 | 37.07 | 61.73 |
Ablations demonstrate that REL’s relative energy yields the highest AUROC and lowest FPR@95 compared to hinge, VOS, and dual energy losses. Backbone finetuning offers further gains (AUROC: frozen $94.43$, finetuned $97.85$), while Point Raise is vital: without synthetic OOD generation, AUROC drops to $90.20$ and FPR@95 rises to $38.31$.
7. Thresholding and Deployment in Safety-Critical Systems
For operational use, REL applies a simple decision rule: classify point $x$ as OOD if $\Delta E(x) > \tau$. A zero-centered threshold ($\tau = 0$) naturally splits inliers from OOD. For controlled false-positive rates (e.g., FPR@95%), $\tau$ can be empirically selected on validation data; this calibration generalizes robustly across scenes and backbones without temperature or margin adjustment. In deployment, maintaining a running histogram of inlier $\Delta E(x)$ enables dynamic threshold adaptation to keep the FPR within safety limits.
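A minimal sketch of such a calibration step (a simple quantile rule on validation inlier gaps, assumed here as one way to realize the described procedure):

```python
import numpy as np

def calibrate_threshold(inlier_gaps: np.ndarray, max_fpr: float = 0.05) -> float:
    """Pick tau so that at most `max_fpr` of validation inliers exceed it.

    inlier_gaps: (N,) relative energy gaps Delta E(x) of validation inlier points.
    Points with Delta E(x) > tau are flagged as OOD at deployment, so the
    inlier false-positive rate on the validation set is at most `max_fpr`.
    """
    return float(np.quantile(inlier_gaps, 1.0 - max_fpr))

# Example: tau = calibrate_threshold(val_inlier_gaps)  # near 0 when gaps are well separated
```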
The shift invariance and universal thresholding of REL streamline the integration of OOD detection into safety-critical autonomous driving stacks, mitigating overconfident errors and enabling robust behavior under open-world uncertainty.