Relative Energy Learning in 3D LiDAR OOD

Updated 17 November 2025
  • REL is a framework for 3D LiDAR OOD detection that uses energy-based modeling with a shift-invariant scoring mechanism to differentiate between inliers and anomalous points.
  • It integrates closed-set segmentation with a binary logistic loss on the relative energy gap, enhancing robustness in safety-critical autonomous driving systems.
  • The approach incorporates a geometry-aware synthetic OOD generation method (Point Raise) to augment training data and improve detection metrics on benchmarks.

Relative Energy Learning (REL) is a framework for out-of-distribution (OOD) detection in 3D LiDAR point clouds, designed for safety-critical autonomous driving environments where reliable identification of rare or anomalous objects is essential. Unlike prior approaches that transfer 2D image OOD methods to 3D data with limited success, REL introduces a shift-invariant energy scoring mechanism and a tailored synthetic OOD data strategy, yielding robust discrimination between inlier and anomalous points.

1. Energy-Based Modeling for LiDAR OOD Detection

REL builds on energy-based models, which assess sample “confidence” by computing energy scores from neural logits. Formally, given a segmentation network with logits $f(x) = [f_1(x), \ldots, f_C(x)]^T$ for $C$ in-distribution classes:

$$E(x) = -T \cdot \log \left[ \sum_{i=1}^{C} e^{f_i(x)/T} \right]$$

where $T$ is a temperature hyperparameter. OOD samples typically yield higher energies than inliers. Prior methods employ hinge losses with preset margins and temperature scaling, but calibration becomes tenuous under the severe class imbalance and varying logit scales common in LiDAR tasks.
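
A minimal sketch of this scoring, assuming PyTorch and an $(N, C)$ tensor of per-point logits (the function name and shapes are illustrative, not from the paper):

```python
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Free energy E(x) = -T * logsumexp(f(x) / T), computed per point.

    logits: (N, C) in-distribution logits for N points.
    Returns an (N,) tensor; higher values indicate more OOD-like points.
    """
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)
```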

2. Relative Energy: Motivation, Definition, and Invariance

Relative Energy ameliorates the shift- and scale-sensitivity of raw energy scores by comparing the summed exponentials of positive (“inlier”) logits and negative (learned “OOD”) logits. The network is reparameterized to output $2K$ logits per point, with the first $K$ for in-distribution classes and the second $K$ for negative/OOD classes:

  • Positive free energy: $E_{pos}(x) = -\log \sum_{i \in y^+} e^{f_i(x)}$
  • Negative free energy: $E_{neg}(x) = -\log \sum_{i \in y^-} e^{f_i(x)}$
  • Relative energy gap:

$$\Delta E(x) = E_{neg}(x) - E_{pos}(x) = \log \left[ \frac{\sum_{i \in y^-} e^{f_i(x)}}{\sum_{i \in y^+} e^{f_i(x)}} \right]$$

This ratio is shift-invariant, reducing the need for per-scene or per-backbone calibration, and reliably separates the energy distributions of inliers and OOD points even when classes are imbalanced ($\Delta E(x) \ll 0$ for inliers; $\Delta E(x) \gtrsim 0$ for OOD).
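
A minimal sketch of the gap computation under this reparameterization, assuming the $2K$ logits arrive as a single $(N, 2K)$ tensor (names and shapes are illustrative):

```python
import torch

def relative_energy_gap(logits_2k: torch.Tensor, k: int) -> torch.Tensor:
    """Shift-invariant relative energy gap Delta E(x) per point.

    logits_2k: (N, 2K) tensor; columns [:k] are positive (inlier) logits,
    columns [k:] are negative (OOD) logits.
    Returns an (N,) tensor: strongly negative for inliers, near or above zero for OOD.
    """
    e_pos = -torch.logsumexp(logits_2k[:, :k], dim=-1)  # E_pos(x)
    e_neg = -torch.logsumexp(logits_2k[:, k:], dim=-1)  # E_neg(x)
    return e_neg - e_pos  # a constant shift of all logits cancels in the difference
```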

3. Integrated Training Objective

REL’s training combines closed-set segmentation, via the Mask4Former objective, with a binary logistic loss on the relative energy gap. The main terms are:

  • Mask/class loss ($\mathcal{L}_{cls}$): cross-entropy and Hungarian-matched mask overlap, as in Mask4Former.
  • REL OOD loss ($\mathcal{L}_{REL}$):

$$\mathcal{L}_{REL} = \mathbb{E}_{x \sim D_{in}}\left[-\log \sigma(-\Delta E(x))\right] + \omega \, \mathbb{E}_{x \sim D_{aux}}\left[-\log \sigma(\Delta E(x))\right]$$

where $D_{in}$ is the set of inliers, $D_{aux}$ is the set of synthetic OOD points from Point Raise, $\omega$ is an imbalance weight (typically 100), and $\sigma(z)$ is the sigmoid function. The combined objective is:

$$\mathcal{L}_{total} = \mathcal{L}_{cls} + \lambda \cdot \mathcal{L}_{REL}$$

with $\lambda = 1$ by default. This encourages the relative energy gap to separate the inlier and OOD populations.
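
A sketch of the OOD term under these definitions, using the identity $-\log \sigma(z) = \mathrm{softplus}(-z)$ and assuming per-point gaps plus a boolean mask marking the Point Raise points (names are illustrative):

```python
import torch
import torch.nn.functional as F

def rel_ood_loss(gap: torch.Tensor, is_aux: torch.Tensor, omega: float = 100.0) -> torch.Tensor:
    """Binary logistic loss on the relative energy gap.

    gap:    (N,) relative energy gaps Delta E(x).
    is_aux: (N,) boolean mask, True for synthetic OOD (Point Raise) points.
    Assumes both populations are present in the batch.
    """
    loss_in = F.softplus(gap[~is_aux]).mean()   # -log sigma(-gap): push inlier gaps negative
    loss_aux = F.softplus(-gap[is_aux]).mean()  # -log sigma(gap): push OOD gaps positive
    return loss_in + omega * loss_aux
```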

4. Point Raise: Geometry-Aware Synthetic OOD Generation

The absence of annotated OOD points in training data is addressed by Point Raise, a lightweight geometry-aware synthesis algorithm. Its steps are:

  • Select random road points from the point cloud $P$, given labels $L$.
  • Sample a cluster within radius $r$ using a KD-tree over $P$. Compute the spatial distances $\|P[\text{cluster}]\|_2$ and extract $d_{min}$ and $d_{max}$.
  • Apply an inward pull: compute the adaptive decay $a = -\log(d_{min}/d_{max}) / [\gamma (d_{max} - d_{min})]$, then scale each point’s $xy$ coordinates by $s = \exp(-a \cdot d_{shift})$ with $d_{shift} = d(p) - d_{min}$.
  • Assign random heights to cluster points: $h_j \sim U(h_{min}, h_{max})$, updating $P[p_z]$.
  • Relabel the affected points as “RAISED_CLASS” (auxiliary OOD class).

Recommended hyperparameters are $r_{min} = 0.25$ m, $r_{max} = 0.75$ m, $h_{min} = 0.25$ m, $h_{max} = 0.75$ m, and $\gamma = 2$. This produces compact, object-like OOD clusters that do not overlap inlier semantics.
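
A schematic NumPy/SciPy sketch of these steps; the seed selection, the choice to set heights relative to the seed’s road height, and the numerical guards are assumptions rather than details confirmed by the source:

```python
import numpy as np
from scipy.spatial import cKDTree

def point_raise(P, labels, road_label, raised_class, r=0.5,
                h_range=(0.25, 0.75), gamma=2.0, rng=None):
    """Raise a compact cluster of road points into an object-like OOD blob.

    P: (N, 3) point cloud, labels: (N,) semantic labels.
    Returns modified copies of P and labels.
    """
    if rng is None:
        rng = np.random.default_rng()
    P, labels = P.copy(), labels.copy()

    # 1. Pick a random road point as the cluster seed.
    seed = P[rng.choice(np.flatnonzero(labels == road_label))]

    # 2. Gather all points within radius r of the seed via a KD-tree.
    cluster = np.asarray(cKDTree(P).query_ball_point(seed, r))

    # 3. Distances from the sensor origin and the adaptive decay rate a.
    d = np.linalg.norm(P[cluster], axis=1)
    d_min, d_max = d.min(), d.max()
    a = -np.log(d_min / d_max) / (gamma * (d_max - d_min) + 1e-8)

    # 4. Inward pull: scale xy coordinates so farther points shrink more.
    s = np.exp(-a * (d - d_min))
    P[cluster, :2] *= s[:, None]

    # 5. Assign random heights (relative to the seed) and relabel as the OOD class.
    P[cluster, 2] = seed[2] + rng.uniform(*h_range, size=cluster.size)
    labels[cluster] = raised_class
    return P, labels
```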

5. Network Architecture and REL Integration

The backbone is Mask4Former-3D: a Minkowski Sparse-UNet encoder and a transformer decoder with FPS-sampled queries, producing panoptic masks for the $K$ inlier classes. REL’s OOD scoring is realized as:

  • An auxiliary projector branch appended to every point’s encoder features
  • 3 $\times$ (Linear $\rightarrow$ ReLU) layers yielding $2K$ logits
    • First $K$: positive logits $f_i(x)$ for in-distribution classes
    • Next $K$: negative logits for OOD
  • The relative energy gap $\Delta E(x)$, computed per point and used for the OOD decision

During inference, segmentation proceeds via Mask4Former; REL assigns an OOD score to each point via $\Delta E(x)$.
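
A sketch of how the projector branch and per-point scoring could look in PyTorch; the hidden width and feature dimension are assumptions, and the final ReLU simply mirrors the stated 3 $\times$ (Linear $\rightarrow$ ReLU) structure:

```python
from typing import Tuple

import torch
import torch.nn as nn

class RELProjector(nn.Module):
    """Auxiliary branch mapping per-point encoder features to 2K logits."""

    def __init__(self, feat_dim: int, num_classes: int, hidden: int = 256):
        super().__init__()
        self.k = num_classes
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * num_classes), nn.ReLU(),
        )

    def forward(self, point_feats: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        logits = self.net(point_feats)                         # (N, 2K)
        e_pos = -torch.logsumexp(logits[:, :self.k], dim=-1)   # positive free energy
        e_neg = -torch.logsumexp(logits[:, self.k:], dim=-1)   # negative free energy
        return logits, e_neg - e_pos                           # logits and Delta E(x)
```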

6. Empirical Performance and Ablation Results

REL has been evaluated on the STU (Spotting the Unexpected) and SemanticKITTI benchmarks using both point- and object-level OOD metrics, against baselines including Deep Ensemble, MC Dropout, Max Logit (MSP), Void Classifier, RbA, and UEM. Representative results:

| Dataset (Method) | AUROC (↑) | FPR@95% (↓) | AP (↑) |
|---|---|---|---|
| STU val (REL) | 97.85 | 9.60 | 10.68 |
| STU val (UEM) | 95.80 | 26.37 | 6.78 |
| STU test (REL) | 96.26 | 21.69 | 10.17 |
| STU test (Void) | 85.99 | 78.60 | 3.92 |
| KITTI outlier (REL) | 96.76 | 18.66 | 67.32 |
| KITTI outlier (UEM) | 93.15 | 37.07 | 61.73 |

Ablations demonstrate that REL’s relative energy yields the highest AUROC and lowest FPR@95 compared to hinge, VOS, and dual energy losses. Backbone finetuning offers further gains (AUROC: frozen $94.43$, finetuned $97.85$), and Point Raise is vital ($\gamma = 2$ works best; without synthesis, AUROC drops to $90.20$ and FPR@95 rises to $38.31$).

7. Thresholding and Deployment in Safety-Critical Systems

For operational use, REL applies a simple decision rule: classify point $x$ as OOD if $\Delta E(x) > \tau$. A zero-centered threshold ($\tau = 0$) naturally splits inliers from OOD. For controlled false-positive rates (e.g., FPR@95%), $\tau$ can be selected empirically on validation data; this calibration generalizes robustly across scenes and backbones without temperature or margin adjustment. In deployment, maintaining a running histogram of inlier $\Delta E$ enables dynamic threshold adaptation to keep the FPR within safety limits.
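
A sketch of the decision rule, together with one simple calibration choice (picking $\tau$ as a quantile of the inlier gap distribution to cap the inlier false-positive rate; this is an illustrative assumption, not the paper’s exact procedure):

```python
import numpy as np

def calibrate_threshold(inlier_gaps: np.ndarray, quantile: float = 0.95) -> float:
    """Pick tau so that roughly (1 - quantile) of validation inliers exceed it.

    inlier_gaps: Delta E values of known-inlier validation points.
    With quantile = 0.95, about 5% of inliers would be wrongly flagged as OOD.
    """
    return float(np.quantile(inlier_gaps, quantile))

def is_ood(gap: np.ndarray, tau: float = 0.0) -> np.ndarray:
    """REL decision rule: a point is OOD if its relative energy gap exceeds tau."""
    return gap > tau
```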

The shift invariance and universal thresholding of REL streamline the integration of OOD detection into safety-critical autonomous driving stacks, mitigating overconfident errors and enabling robust behavior under open-world uncertainty.
