
DoGFlow: Radar-Guided LiDAR Scene Flow

Updated 4 July 2026
  • The paper introduces DoGFlow, a self-supervised framework that transfers motion cues from radar Doppler to LiDAR via a cross-modal pseudo-labeling process.
  • It employs a novel two-stage architecture with radar-based dynamic clustering and range-adaptive LiDAR association to resolve motion ambiguity.
  • Extensive experiments on MAN TruckScenes demonstrate robust, long-range scene flow estimation with over 90% recovery of fully supervised performance using only 10% labels.

DoGFlow is a self-supervised LiDAR scene flow framework that transfers motion information from 4D mmWave radar Doppler to LiDAR in order to generate dense 3D scene flow pseudo-labels without manual ground-truth annotation. It is formulated as a cross-modal, training-free pseudo-labeler coupled to a training pipeline for LiDAR backbones, with radar used only during pseudo-label generation and not at inference. In the reported MAN TruckScenes experiments, DoGFlow substantially outperforms prior self-supervised baselines, remains competitive at long range, and enables a LiDAR backbone to recover over 90% of fully supervised performance with only 10% of the ground-truth labels (Khoche et al., 25 Aug 2025).

1. Problem setting and motivation

LiDAR scene flow estimates dense 3D motion between successive point clouds. In autonomous driving, the task is directly relevant to detection, tracking, segmentation, and prediction, and the paper emphasizes that long-range motion in the 50–200 m regime and robustness to rain, snow, and fog are particularly important for safe planning (Khoche et al., 25 Aug 2025).

DoGFlow is motivated by limitations in both dominant supervision regimes. Fully supervised approaches can achieve strong performance, but their dependence on expensive human labeling creates a scaling bottleneck, and long-range boundaries and adverse-weather scenarios remain underrepresented in labeled datasets. LiDAR-only self-supervised approaches instead depend on geometric correspondences such as Chamfer or cycle consistency, which deteriorate when LiDAR geometry becomes sparse or noisy, often producing over-smoothing or unstable optimization. DoGFlow addresses this gap by exploiting 4D radar Doppler, which is robust to adverse weather and directly measures radial velocity, while explicitly handling the fact that Doppler is radial-only and susceptible to multipath and ghost returns (Khoche et al., 25 Aug 2025).

A central design choice is therefore cross-modal label transfer rather than direct radar inference at deployment time. The reported system uses radar to recover object-level motion cues where LiDAR-only geometric matching is weakest, then transfers those cues into the LiDAR domain to supervise a conventional feedforward LiDAR model. This suggests a separation between an offline label-generation stage and a radar-free online inference stage.

2. Two-stage architecture

DoGFlow is organized as a two-stage pipeline. The first stage estimates cluster-level velocities from radar; the second propagates those velocities to LiDAR points through dynamic-aware association and ambiguity-resolved label propagation (Khoche et al., 25 Aug 2025).

In the radar stage, Doppler measurements are first ego-motion compensated and used to detect dynamic radar points. Dynamic points are then grouped by graph-based clustering, specifically Connected Components Labeling, in a joint space of position and compensated velocity. Each connected component is treated as a radar cluster corresponding to a moving object, and a full 3D translational velocity is estimated for that cluster by least squares under physically plausible bounds.
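The grouping step can be illustrated with a minimal sketch. It assumes two radar points are linked when they are close in both position and compensated radial velocity, and it labels connected components with a small union-find. The function name `cluster_dynamic_points` and the thresholds `eps_pos` and `eps_vel` are hypothetical, and the brute-force pair scan stands in for whatever neighbor search the paper's implementation uses.

```python
import math

def cluster_dynamic_points(points, velocities, eps_pos, eps_vel):
    """Label connected components among dynamic radar points.

    points: list of (x, y, z) positions.
    velocities: list of ego-compensated radial velocities.
    eps_pos, eps_vel: hypothetical linkage thresholds (not the paper's values).
    Returns one integer cluster label per point, numbered by first appearance.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair that is coherent in position AND in Doppler.
    for i in range(n):
        for j in range(i + 1, n):
            close = math.dist(points[i], points[j]) <= eps_pos
            coherent = abs(velocities[i] - velocities[j]) <= eps_vel
            if close and coherent:
                parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

In this sketch, two spatially adjacent returns with very different compensated velocities end up in different clusters, which is the point of clustering in a joint position and velocity space.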

In the LiDAR stage, points are denoised and ground is removed before cross-modal association. LiDAR points are associated to dynamic radar points with a range-adaptive nearest-neighbor rule. High-intensity associated LiDAR points are clustered with HDBSCAN, and low-intensity points are reintegrated into the nearest cluster within a fixed neighbor radius. A LiDAR cluster is labeled dynamic if a majority of its associated radar points are dynamic. The resulting radar-derived velocity or velocities are then transferred to the LiDAR cluster. When multiple radar clusters are associated to the same LiDAR cluster, DoGFlow resolves the ambiguity by forward-projecting the LiDAR cluster under each candidate velocity and selecting the one that best matches the next LiDAR scan under Chamfer distance (Khoche et al., 25 Aug 2025).
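The association and voting steps described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the gate radius is assumed to grow linearly between two hypothetical radii `r_near` and `r_far` up to a hypothetical `range_max`, the nearest-neighbor search is brute force, and the voting ratio of 0.5 simply encodes "a majority".

```python
import math

def adaptive_radius(rng, r_near, r_far, range_max):
    """Gate radius that grows linearly with range and saturates at range_max."""
    frac = min(max(rng / range_max, 0.0), 1.0)
    return r_near + frac * (r_far - r_near)

def associate(lidar_pts, radar_pts, r_near, r_far, range_max):
    """For each LiDAR point, return the index of the nearest radar point
    inside its range-adaptive gate, or None when no radar point qualifies."""
    matches = []
    for p in lidar_pts:
        gate = adaptive_radius(math.hypot(*p), r_near, r_far, range_max)
        best, best_d = None, gate
        for k, q in enumerate(radar_pts):
            d = math.dist(p, q)
            if d <= best_d:
                best, best_d = k, d
        matches.append(best)
    return matches

def is_dynamic_cluster(matches, radar_is_dynamic, ratio=0.5):
    """Majority vote over the radar points associated with one LiDAR cluster.
    ratio=0.5 is an assumed value standing in for the paper's threshold."""
    votes = [radar_is_dynamic[k] for k in matches if k is not None]
    return bool(votes) and sum(votes) > ratio * len(votes)
```

A wider gate at long range compensates for the growing spacing between returns, while the vote prevents a single spurious dynamic radar point from relabeling an otherwise static LiDAR cluster.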

This architecture is notable for avoiding global bipartite assignment. The paper states that no global Hungarian matching is required because the association is local and many-to-one at the cluster level.

3. Radar kinematics, ego compensation, and ambiguity resolution

DoGFlow operates in calibrated sensor and ego frames. Let $E$ denote the ego or vehicle frame and $S_i$ a radar frame. Using the extrinsic calibration $T_{E\leftarrow S}$, points are transformed as

$$x_E = R_{E\leftarrow S}\, x_S + t_{E\leftarrow S}.$$

TruckScenes provides synchronized, time-stamped multi-LiDAR and multi-4D-radar at 10 Hz, and the framework fuses all points in the ego frame at each timestamp before association (Khoche et al., 25 Aug 2025).
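As a small illustration of the frame change and fusion described above, the sketch below applies a rotation and translation to each sensor's points and concatenates the results in the ego frame. The helper names `to_ego` and `fuse_in_ego` and the nested-list representation of the rotation are assumptions made for the example.

```python
def to_ego(point_s, R_es, t_es):
    """Map a sensor-frame point into the ego frame: x_E = R x_S + t.

    R_es is a 3x3 rotation as nested lists, t_es a length-3 translation.
    """
    return tuple(
        sum(R_es[r][c] * point_s[c] for c in range(3)) + t_es[r]
        for r in range(3)
    )

def fuse_in_ego(clouds):
    """Merge several sensors' points into one ego-frame list.

    clouds: list of (points, R_es, t_es) tuples, one per sensor.
    """
    fused = []
    for points, R_es, t_es in clouds:
        fused.extend(to_ego(p, R_es, t_es) for p in points)
    return fused
```

For example, a sensor yawed by 90 degrees and mounted at (1, 2, 0.5) maps its own forward point (1, 0, 0) to (1, 3, 0.5) in the ego frame.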

The radar signal model begins with the Doppler relation

$$f_D = \frac{2 f_c}{c}\, v_r,\qquad v_r = \frac{c\, f_D}{2 f_c},$$

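where $v_r$ is radial velocity. For a radar return $i$ with position $p_i$, measured Doppler radial velocity $v_{r,i}$, and unit line of sight $u_i$, DoGFlow ego-compensates Doppler as

$$\tilde v_{r,i} = v_{r,i} - v^{\mathrm{ego}}_{r,i},$$

where $v^{\mathrm{ego}}_{r,i}$ is the radial velocity that ego-motion alone induces along $u_i$, and marks a radar point dynamic when $|\tilde v_{r,i}| > \tau_v$. The implementation sets $\tau_v$ to […] (Khoche et al., 25 Aug 2025).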
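The Doppler relation and the compensation step can be made concrete with a short sketch. The sign convention is an assumption chosen for illustration: a positive measured radial velocity means the target recedes, so a static target seen from a moving sensor reads as minus the sensor velocity projected on the line of sight. The function names and the `threshold` argument are hypothetical; the paper's threshold value is not reproduced here.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def doppler_to_radial(f_d, f_c):
    """Invert f_D = 2 f_c v_r / c for the radial velocity v_r."""
    return C * f_d / (2.0 * f_c)

def line_of_sight(p):
    """Unit vector from the sensor origin to a return at sensor-frame position p."""
    norm = math.sqrt(sum(c * c for c in p))
    return tuple(c / norm for c in p)

def compensate(v_meas, p, v_sensor):
    """Remove the ego-induced component from a measured radial velocity.

    ASSUMED convention: v_meas > 0 means the target recedes, so a static
    target reads -(u . v_sensor) and adding u . v_sensor cancels it.
    """
    u = line_of_sight(p)
    return v_meas + sum(ui * vi for ui, vi in zip(u, v_sensor))

def is_dynamic(v_comp, threshold):
    """Flag a return as dynamic when its compensated speed exceeds a threshold."""
    return abs(v_comp) > threshold
```

With the sensor driving at 20 m/s along +x, a static return straight ahead measures -20 m/s and compensates to 0, whereas a lead vehicle measured at +5 m/s compensates to 25 m/s and is flagged dynamic.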

Dynamic radar clustering enforces both spatial and Doppler coherence. Two radar points $i, j$ with positions $p_i, p_j$ and compensated radial velocities $\tilde v_{r,i}, \tilde v_{r,j}$ are connected when

$$\lVert p_i - p_j \rVert_2 \le \epsilon_p \quad\text{and}\quad |\tilde v_{r,i} - \tilde v_{r,j}| \le \epsilon_v,$$

with $\epsilon_p =$ […] and $\epsilon_v =$ […]. For each cluster $C_k$, DoGFlow estimates a constant 3D translational velocity $v_k$ by solving $\min_{v_k} \sum_{i \in C_k} \left(u_i^\top v_k - \tilde v_{r,i}\right)^2$, where $u_i$ is the unit line of sight of return $i$, with bound-constrained least squares. The paper also presents the more general rigid-body model

$$v(p) = v_k + \omega_k \times (p - c_k),$$

with angular velocity $\omega_k$ about a cluster reference point $c_k$, and the corresponding Doppler constraint

$$u_i^\top \left(v_k + \omega_k \times (p_i - c_k)\right) = \tilde v_{r,i},$$

but states that DoGFlow instantiates a translation-only model for robustness under sparse and noisy radar and short baselines (Khoche et al., 25 Aug 2025).
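A translation-only fit of this kind can be sketched in a few lines. The example below minimizes the squared mismatch between each line-of-sight projection of the velocity and the compensated radial velocity, subject to per-axis bounds, using projected gradient descent. The solver choice, the function name `solve_cluster_velocity`, the iteration count, and the bound values in the usage are assumptions; the paper only states that a bound-constrained least squares is solved.

```python
def solve_cluster_velocity(units, v_comp, lower, upper, iters=500):
    """Bounded least-squares fit of one 3D velocity to Doppler constraints.

    Minimizes sum_i (u_i . v - v_i)^2 subject to lower <= v <= upper
    (per axis) with projected gradient descent.

    units: unit line-of-sight vectors u_i.
    v_comp: ego-compensated radial velocities v_i.
    lower, upper: hypothetical physical-plausibility bounds per axis.
    """
    # Gram matrix G = sum u u^T and right-hand side h = sum v_i u_i.
    G = [[sum(u[r] * u[c] for u in units) for c in range(3)] for r in range(3)]
    h = [sum(u[r] * s for u, s in zip(units, v_comp)) for r in range(3)]
    # trace(G) bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / max(G[0][0] + G[1][1] + G[2][2], 1e-12)
    v = [0.0, 0.0, 0.0]
    for _ in range(iters):
        grad = [sum(G[r][c] * v[c] for c in range(3)) - h[r] for r in range(3)]
        # Gradient step followed by projection onto the box constraints.
        v = [min(max(v[r] - step * grad[r], lower[r]), upper[r]) for r in range(3)]
    return v
```

Because each radar return constrains only the component of velocity along its own line of sight, returns from a single compact object give nearly parallel constraints; the bounds keep the poorly observed components within a plausible range in that case.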

Cross-modal propagation uses a range-adaptive gate that increases linearly with range from […] to […]. High-intensity LiDAR points are selected using […], and low-intensity neighbors are reassigned within […]. Dynamic labeling is then determined by majority voting: if more than […] of the associated radar points for a LiDAR cluster are dynamic, the cluster is marked dynamic. When several radar clusters map to the same LiDAR cluster $\mathcal{L}$, the ambiguity is resolved by minimizing symmetric Chamfer distance to the next scan $\mathcal{L}'$ after forward projection over the candidate velocities $\mathcal{V}$ and inter-scan interval $\Delta t$:

$$v^\star = \arg\min_{v \in \mathcal{V}} \ \mathrm{CD}\left(\{\, q + v\,\Delta t : q \in \mathcal{L} \,\},\ \mathcal{L}'\right),$$

with […] and […]. The chosen velocity is then propagated as the flow $v^\star \Delta t$ for every point in the cluster (Khoche et al., 25 Aug 2025).
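The ambiguity-resolution step can be sketched as a small search over candidate velocities. The Chamfer variant used here (mean nearest-neighbor distance in both directions, unsquared) is one common definition and is an assumption, as are the function names and the brute-force nearest-neighbor search.

```python
import math

def chamfer(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbor distance, both ways."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)

def select_velocity(cluster, next_scan, candidates, dt):
    """Pick the candidate velocity whose forward projection of the cluster
    best matches the next LiDAR scan under Chamfer distance."""
    def projected(v):
        return [tuple(c + vc * dt for c, vc in zip(p, v)) for p in cluster]
    return min(candidates, key=lambda v: chamfer(projected(v), next_scan))

def propagate(cluster, velocity, dt):
    """Assign the same displacement velocity * dt to every point in the cluster."""
    flow = tuple(vc * dt for vc in velocity)
    return [flow for _ in cluster]
```

In practice `next_scan` would be restricted to points near the cluster, since comparing against the full scan makes the second Chamfer term dominated by unrelated geometry.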

4. Pseudo-label generation and supervision of LiDAR backbones

For a LiDAR point $p_j^{t}$ observed at time $t$, scene flow is defined as the displacement to its position at the next timestamp,

$$f_j = p_j^{t+1} - p_j^{t}.$$

With known ego-motion, the model predicts residual non-ego flow $\Delta f_j$ through

$$f_j = f_j^{\mathrm{ego}} + \Delta f_j.$$

DoGFlow supplies dense pseudo-labels $f_j^{\mathrm{pl}}$ for LiDAR points in dynamic clusters, and optionally zeros for static points. These labels supervise a standard LiDAR backbone; the paper uses SSF as the main feedforward model and also compares to DeFlow (Khoche et al., 25 Aug 2025).
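The split between ego-induced and residual flow can be illustrated with a minimal sketch. It assumes the relative pose `(R, t)` maps coordinates at time t into the vehicle frame at time t+1, so that a static point's apparent displacement is fully explained by that pose; the helper names are hypothetical.

```python
def ego_flow(p, R, t):
    """Apparent displacement of a static point caused by ego-motion alone.

    (R, t) is ASSUMED to map frame-t coordinates into the frame at t+1,
    so the ego flow is (R p + t) - p.
    """
    moved = [sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3)]
    return tuple(m - x for m, x in zip(moved, p))

def total_flow(p, R, t, residual):
    """Full scene flow as ego flow plus the predicted non-ego residual."""
    return tuple(e + d for e, d in zip(ego_flow(p, R, t), residual))
```

If the vehicle drives 2 m straight ahead between scans, a static point appears to move by (-2, 0, 0); an object that itself advanced 1 m in the same direction has residual (1, 0, 0) and total flow (-1, 0, 0).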

The primary training objective is direct regression to pseudo-labels:

$$\mathcal{L} = \sum_j w_j \,\bigl\lVert \hat f_j - f_j^{\mathrm{pl}} \bigr\rVert,$$

where $\hat f_j$ is the predicted flow and $f_j^{\mathrm{pl}}$ the pseudo-label for point $j$. In the reported experiments, the weights are set to $w_j =$ […] for points with labels, and unlabeled points are ignored. The paper notes that forward–backward consistency, smoothness regularization, and occlusion or warping losses are compatible standard extensions, but they are not required by DoGFlow’s core pipeline (Khoche et al., 25 Aug 2025).
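A masked regression loss of this kind can be sketched without any deep-learning framework. The example assumes an L2 per-point error, normalization by the total weight, and a default weight of 1 for labeled points; all three are illustrative choices, and `None` is used here as a hypothetical marker for unlabeled points.

```python
import math

def pseudo_label_loss(pred, labels, weights=None):
    """Weighted mean L2 error between predicted flow and pseudo-labels.

    pred: list of predicted flow vectors.
    labels: list of pseudo-label vectors, or None where a point is unlabeled.
    weights: optional per-point weights (ASSUMED to default to 1.0).
    Unlabeled points contribute nothing to the loss.
    """
    total, norm = 0.0, 0.0
    for j, (f_hat, f_pl) in enumerate(zip(pred, labels)):
        if f_pl is None:
            continue  # skip points without a pseudo-label
        w = 1.0 if weights is None else weights[j]
        total += w * math.dist(f_hat, f_pl)
        norm += w
    return total / norm if norm > 0 else 0.0
```

Ignoring unlabeled points, rather than treating them as static, avoids penalizing the network for predicting motion on objects the radar failed to observe.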

A practical consequence of this design is that radar is absent at deployment. The training-free DoGFlow pseudo-labeler runs offline, after which a feedforward LiDAR model can be trained and deployed in real time. On an RTX 3090, the paper reports […] per frame and peak memory below […] for the labeler, versus […] per frame for the feedforward SSF trained on DoGFlow labels (Khoche et al., 25 Aug 2025).

5. Dataset, metrics, and empirical results

The experiments are conducted on MAN TruckScenes, which contains […] scenes of approximately 20 s each at 10 Hz, with clear, overcast, rain, snow, and fog conditions, multiple long-range LiDARs, and modern 4D mmWave radars with 360° coverage. The training split has […] frames, and 3D boxes are available every fifth frame. The paper explicitly notes that existing scene-flow benchmarks lack 4D radar, making TruckScenes well suited to cross-modal Doppler-guided supervision (Khoche et al., 25 Aug 2025).

Evaluation uses EPE3D, range-wise dynamic EPE, dynamic IoU, and three-way EPE. EPE3D is the per-point $\ell_2$ error $\lVert \hat f_j - f_j^{\mathrm{gt}} \rVert_2$ between predicted and ground-truth flow. Dynamic IoU is computed on a dynamic mask defined by […] per frame. Three-way EPE averages over Foreground Dynamic, Foreground Static, and Background Static (Khoche et al., 25 Aug 2025).
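The metrics named above can be sketched as follows. The example assumes that the dynamic mask thresholds flow magnitude, applied identically to prediction and ground truth, and that three-way EPE is the unweighted mean of the per-category mean errors; the threshold argument `thresh` and the function names are hypothetical.

```python
import math

def epe3d(pred, gt):
    """Per-point end-point error: L2 distance between predicted and true flow."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]

def dynamic_iou(pred, gt, thresh):
    """IoU between predicted and true dynamic masks.

    A point is ASSUMED dynamic when its flow magnitude exceeds thresh.
    """
    pd = [math.hypot(*p) > thresh for p in pred]
    gd = [math.hypot(*g) > thresh for g in gt]
    inter = sum(a and b for a, b in zip(pd, gd))
    union = sum(a or b for a, b in zip(pd, gd))
    return inter / union if union else 1.0

def three_way_epe(errors, is_foreground, is_dynamic):
    """Mean of the mean errors over Foreground Dynamic, Foreground Static,
    and Background Static; empty categories are skipped."""
    buckets = {"FD": [], "FS": [], "BS": []}
    for e, fg, dyn in zip(errors, is_foreground, is_dynamic):
        if fg:
            buckets["FD" if dyn else "FS"].append(e)
        elif not dyn:
            buckets["BS"].append(e)
    means = [sum(v) / len(v) for v in buckets.values() if v]
    return sum(means) / len(means) if means else 0.0
```

Averaging per category rather than per point keeps the far more numerous static background points from masking errors on the comparatively few dynamic foreground points.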

On the TruckScenes validation set, DoGFlow reports a three-way EPE of […], identified as best among the self-supervised methods in the table and […] better than FastNSF’s […]. Its range-wise dynamic EPE is […] for […] and […] for […], while dynamic IoU is […] and […] in the same bins, respectively. The paper emphasizes that DoGFlow degrades slowly with range and remains competitive at […], where Chamfer-based methods fail because of sparsity and occlusion (Khoche et al., 25 Aug 2025).

When pseudo-labels are used to train SSF, DoGFlow again exceeds alternative pseudo-label sources. The reported dynamic EPE for SSF trained on DoGFlow pseudo-labels is […] at […] and […] at […], with dynamic IoU of […] and […]. For comparison, FastNSF pseudo-labels yield […] dynamic EPE and […] dynamic IoU, while ICP-Flow pseudo-labels yield […] and […] (Khoche et al., 25 Aug 2025).

The paper’s label-efficiency result is one of its strongest reported findings. SSF pretrained with DoGFlow pseudo-labels and then fine-tuned with only 10% ground truth reaches mean dynamic EPE […], compared with […] for fully supervised SSF, which the paper summarizes as over 90% of fully supervised performance with 10% labels. Zero-shot performance after DoGFlow pretraining, at […], is described as comparable to training from scratch with […] ground truth, at […] (Khoche et al., 25 Aug 2025).

In adverse weather, the mean range-wise dynamic EPE and IoU for DoGFlow are reported as follows: clear […], overcast […], rain […], snow […], and fog […]. The paper states that DoGFlow strongly outperforms Chamfer-based self-supervised baselines in IoU across all weather conditions and achieves the best or second-best EPE in most weather categories, with particularly strong snow performance attributed to radar robustness (Khoche et al., 25 Aug 2025).

An ablation further isolates the value of Doppler-based dynamic awareness. Replacing DUFOMap with radar-based dynamic classification inside SeFlow improves dynamic EPE by […] at […] and […] at […], while improving dynamic IoU by […] and […], respectively (Khoche et al., 25 Aug 2025).

6. Limitations, deployment considerations, and nomenclature

The paper identifies several failure modes. DoGFlow is sensitive to calibration because it requires accurate, static extrinsics between radar and LiDAR; miscalibration produces systematic velocity bias and association error. Close-range and slow movers remain difficult because 4D radar Doppler resolution and sensor blind spots can miss low-speed lateral motion, such as pedestrians near the vehicle sides. Rotational motion is not modeled explicitly because DoGFlow estimates only cluster-level constant translation, so significant within-object rotation cannot be recovered from radial Doppler alone under the chosen formulation. Severe multipath and aliasing can still corrupt the candidate velocity set despite the ambiguity-resolution mechanism. Finally, the training-free labeler is not real time at roughly […] per frame, so the intended use is offline pseudo-label generation followed by real-time LiDAR inference at roughly […] per frame (Khoche et al., 25 Aug 2025).

These limitations directly shape deployment practice. The recommended integration is to use DoGFlow offline with a multi-4D-radar and multi-LiDAR sensor suite, generate large-scale pseudo-labels under synchronized timestamps and precise extrinsic calibration, and then train a sparse-convolution LiDAR scene-flow backbone such as SSF or DeFlow. The paper notes that graph-based radar clustering and Chamfer-based ambiguity resolution dominate runtime and are natural targets for acceleration through spatial indexing and GPU nearest-neighbor search. It also identifies range-adaptive association, intensity-aware LiDAR clustering, and majority voting as crucial to weather and long-range robustness (Khoche et al., 25 Aug 2025).

The term itself requires disambiguation. In Khoche et al. (25 Aug 2025), DoGFlow is the official name of the LiDAR scene-flow method based on cross-modal Doppler guidance. However, the later time-series paper “DoFlow: Causal Generative Flows for Interventional and Counterfactual Time-Series Prediction” states that “DoGFlow” is sometimes used informally to refer to its own method, whose official name is DoFlow rather than a separate model (Wu et al., 4 Nov 2025). A further neighboring name, DiG-Flow, denotes an unrelated discrepancy-guided regularization framework for Vision-Language-Action policies (Zhang et al., 1 Dec 2025). A plausible implication is that “DoGFlow” should be interpreted by domain context: in autonomous-driving perception it refers to cross-modal radar-guided LiDAR scene flow, whereas nearby flow-model literature contains unrelated naming collisions.
