
D-KalmanNet: Doppler Kalman Neural Network

Updated 6 December 2025
  • The paper introduces D-KalmanNet, a recurrent neural architecture that fuses physics-based Gaussian state-space models with neural gain learning for agile tracking of dynamic obstacles.
  • It employs Doppler velocity rectification to fuse instantaneous LiDAR measurements into a robust observation model, ensuring accurate and low-latency state estimation.
  • Empirical results on real-world datasets show D-KalmanNet outperforms both classical and neural baseline filters in NMSE and processing efficiency for dynamic motion planning.

The Doppler Kalman Neural Network (D-KalmanNet) is a recurrent neural architecture designed for high-frequency, data-efficient tracking of dynamic obstacles in environments sensed by Doppler LiDAR. It integrates physically grounded Gaussian state-space modeling with neural gain learning, enabling accurate and low-latency state estimation in motion planning systems such as DPNet. D-KalmanNet employs Doppler velocity rectification to fuse instantaneous velocity measurements into a robust estimation pipeline, supporting agile and robust motion planning in highly dynamic scenarios (Zuo et al., 29 Nov 2025).

1. Probabilistic State-Space Model

D-KalmanNet employs a partially observable Gaussian state-space (GSS) model tailored for Doppler LiDAR-based tracking. Each obstacle $n$ at time $t$ is described by a Doppler-augmented 6-dimensional state: $$\mathbf{x}_t^n = \begin{bmatrix} x_t^n \\ \cos(\theta_t^n)\,v_t^n \\ \cos(\theta_t^n)\,a_t^n \\ y_t^n \\ \sin(\theta_t^n)\,v_t^n \\ \sin(\theta_t^n)\,a_t^n \end{bmatrix} \in \mathbb{R}^6,$$ where $(x_t^n, y_t^n)$ are spatial coordinates, $\theta_t^n$ is orientation, $v_t^n$ is scalar speed, and $a_t^n$ is acceleration. This formulation directly incorporates orientation and kinematic quantities, facilitating physically interpretable tracking.

Observations at each LiDAR sweep are defined as: $$\mathbf{y}_t^n = \begin{bmatrix} x_t^n \\ y_t^n \\ \widehat{v}_t^n \end{bmatrix} \in \mathbb{R}^3,$$ where $\widehat{v}_t^n$ is a Doppler-fused velocity estimate.

The state and observation dynamics are modeled as: $$\mathbf{x}_{t+1}^n = \mathbf{T}\mathbf{x}_t^n + \mathbf{w}_t^n, \quad \mathbf{w}_t^n \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})$$

$$\mathbf{y}_t^n = \mathbf{U}\mathbf{x}_t^n + \mathbf{v}_t^n, \quad \mathbf{v}_t^n \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$$

Here, $\mathbf{T} \in \mathbb{R}^{6\times6}$ is a block-diagonal constant-acceleration transition matrix acting on the $x$ and $y$ state triples, and $\mathbf{U} \in \mathbb{R}^{3\times6}$ projects the state to the measured position and speed. The process and measurement covariances $\mathbf{Q}$ and $\mathbf{R}$ never need to be computed explicitly, since the analytical gain they would normally feed is replaced by a learned one.
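The transition and observation matrices described above can be sketched as follows. The constant-acceleration block is standard; the entries of $\mathbf{U}$ are not spelled out in the source, so the speed row below is an assumption that linearizes scalar speed at a known heading $\theta$ (using $v = \cos\theta \cdot v_x + \sin\theta \cdot v_y$, which is exact for the paper's state parameterization):

```python
import numpy as np

def make_transition(dt: float) -> np.ndarray:
    """Block-diagonal constant-acceleration transition T in R^{6x6}.

    State layout (per the paper): [x, vx, ax, y, vy, ay], where
    vx = cos(theta) v, ax = cos(theta) a, and likewise for y.
    """
    block = np.array([[1.0, dt, 0.5 * dt**2],
                      [0.0, 1.0, dt],
                      [0.0, 0.0, 1.0]])
    T = np.zeros((6, 6))
    T[:3, :3] = block   # x-axis triple: position, velocity, acceleration
    T[3:, 3:] = block   # y-axis triple
    return T

def make_observation(theta: float) -> np.ndarray:
    """Observation matrix U in R^{3x6} projecting to [x, y, speed].

    Assumption: the speed row reads off v = cos(theta)*vx + sin(theta)*vy,
    which equals scalar speed when vx = cos(theta) v and vy = sin(theta) v.
    """
    U = np.zeros((3, 6))
    U[0, 0] = 1.0              # x position
    U[1, 3] = 1.0              # y position
    U[2, 1] = np.cos(theta)    # picks cos(theta) * vx
    U[2, 4] = np.sin(theta)    # picks sin(theta) * vy
    return U
```

For example, propagating a state moving along $x$ at 1 m/s for $dt = 0.1$ s advances the position by 0.1 m, and the observation map recovers the scalar speed.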

2. Kalman Filter-inspired Algorithmic Structure

D-KalmanNet retains the two-step structure of the Kalman filter, but replaces the analytical Kalman gain with a neural network output. The prediction (time) update is: $$\hat{\mathbf{x}}_{t+1|t}^n = \mathbf{T}\mathbf{x}_t^n, \qquad \hat{\mathbf{y}}_{t+1|t}^n = \mathbf{U}\hat{\mathbf{x}}_{t+1|t}^n$$

The correction (measurement) update is: $$\mathbf{x}_{t+1}^n = \hat{\mathbf{x}}_{t+1|t}^n + \mathcal{K}_{t+1}^n \left(\mathbf{y}_{t+1}^n - \hat{\mathbf{y}}_{t+1|t}^n\right),$$ where $\mathcal{K}_{t+1}^n$ is the learned gain.

The Kalman gain is not computed by forward-propagating uncertainty; instead, it is learned directly by a recurrent neural network (RNN) from the input tuple $(\mathbf{x}_t^n,\, \hat{\mathbf{x}}_{t+1|t}^n,\, \hat{\mathbf{y}}_{t+1|t}^n,\, \mathbf{y}_{t+1}^n)$.
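The two-step loop can be sketched as a single filtering step; `gain_fn` here is a placeholder standing in for the recurrent gain network, mapping the input tuple above to a $6 \times 3$ gain matrix:

```python
import numpy as np

def dkalmannet_step(x_post, y_next, T, U, gain_fn):
    """One D-KalmanNet step: linear prediction, then learned-gain correction.

    x_post : posterior state x_t (shape (6,))
    y_next : new observation y_{t+1} (shape (3,))
    gain_fn: callable (x_t, x_pred, y_pred, y_next) -> 6x3 gain matrix,
             a stand-in for the trained recurrent gain network.
    """
    x_pred = T @ x_post                                 # time update
    y_pred = U @ x_pred                                 # predicted measurement
    K = gain_fn(x_post, x_pred, y_pred, y_next)         # learned Kalman gain
    innovation = y_next - y_pred                        # measurement residual
    return x_pred + K @ innovation                      # measurement update
```

With a zero gain the step degenerates to the pure prior prediction, which is a convenient sanity check on the wiring.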

3. Neural Network Architecture for Gain Learning

The gain learning module of D-KalmanNet utilizes a gated recurrent unit (GRU) as its backbone. Key features:

  • Input: The concatenated $18$-dimensional vector of the posterior state, prior prediction, predicted measurement, and current Doppler observation.
  • Hidden size: Typically in the range 32–128; the exact configuration follows KalmanNet conventions.
  • Output: A fully connected linear layer maps GRU outputs to a flattened $6 \times 3 = 18$ vector, reshaped to the gain matrix.
  • Activations: The GRU employs $\tanh$ internally; the output layer is linear.
  • Integration: At each time step, the fused Doppler measurement and prior predictions are input to the RNN to yield the learned Kalman gain.

This learned gain approach adapts to changes in motion dynamics and measurement quality, directly correcting for real-world nonlinearities and model mismatches within a physical Gaussian state-space.
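A minimal numpy sketch of such a GRU-plus-linear-head gain module follows. The hidden size of 64 is one choice from the stated 32–128 range, and the weights here are random placeholders; the real module is trained end-to-end on the trajectory loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class GainGRU:
    """Single GRU cell + linear head emitting a 6x3 Kalman gain.

    A sketch with randomly initialized weights, illustrating the data
    flow only (18-dim input tuple -> hidden state -> flattened gain).
    """
    def __init__(self, in_dim=18, hidden=64, out_shape=(6, 3), seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.Wz = rng.normal(0, s, (hidden, in_dim)); self.Uz = rng.normal(0, s, (hidden, hidden))
        self.Wr = rng.normal(0, s, (hidden, in_dim)); self.Ur = rng.normal(0, s, (hidden, hidden))
        self.Wh = rng.normal(0, s, (hidden, in_dim)); self.Uh = rng.normal(0, s, (hidden, hidden))
        self.Wo = rng.normal(0, s, (out_shape[0] * out_shape[1], hidden))
        self.h = np.zeros(hidden)        # recurrent state carried across time steps
        self.out_shape = out_shape

    def __call__(self, x):
        z = sigmoid(self.Wz @ x + self.Uz @ self.h)          # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ self.h)          # reset gate
        h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * self.h))  # candidate state
        self.h = (1 - z) * self.h + z * h_tilde              # GRU state update
        return (self.Wo @ self.h).reshape(self.out_shape)    # linear layer -> gain
```

Each call consumes the concatenated 18-dimensional tuple (posterior state, prior prediction, predicted measurement, observation) and returns the gain used in the correction update.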

4. Doppler Velocity Rectification and Observation Processing

Raw radial Doppler LiDAR returns are aggregated per obstacle by projecting all points inside each obstacle’s bounding box onto a unified 2D velocity direction, then averaging. This Doppler velocity rectification (Eqs. 7–8 in DPNet) reduces noise in FMCW measurements (typical $\sigma \approx 0.1\,\mathrm{m/s}$ per manufacturer specification) and mitigates measurement bias. The resulting fused 2D velocity estimate is pivotal for accurate speed and heading estimation, keeping the subsequent Kalman loop robust under noise and partial observations.
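A hedged sketch of the rectification idea (not DPNet's exact Eqs. 7–8, which the summary does not reproduce): each radial reading satisfies $v_r = \mathbf{v} \cdot \mathbf{u}$ for line-of-sight unit vector $\mathbf{u}$; assuming all points on the obstacle share a common unit motion direction $\mathbf{d}$, the per-point speed is $v_r / (\mathbf{d} \cdot \mathbf{u})$, and averaging fuses the points:

```python
import numpy as np

def rectify_doppler(points, radial_v, direction, eps=0.2):
    """Fuse per-point radial Doppler velocities into one 2D velocity.

    points   : (M, 2) point positions inside the obstacle's bounding box
    radial_v : (M,) radial velocity readings along each line of sight
    direction: (2,) assumed unit motion direction of the obstacle
    Rays nearly perpendicular to the motion (|d . u| < eps) carry little
    speed information and are discarded before averaging.
    """
    u = points / np.linalg.norm(points, axis=1, keepdims=True)  # lines of sight
    proj = u @ direction                                        # d . u per point
    mask = np.abs(proj) > eps                                   # keep informative rays
    speeds = radial_v[mask] / proj[mask]                        # per-point speed estimates
    return speeds.mean() * direction                            # fused 2D velocity
```

For an obstacle moving at 2 m/s along $+x$, synthetic radial readings $v_r = 2\,u_x$ are rectified back to the full velocity $(2, 0)$.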

5. Training Methodology

D-KalmanNet is trained on real-world FMCW-LiDAR sequences paired with ground-truth 2D trajectories (AevaScenes benchmark). The loss function is a mean squared error over predicted future $(x, y)$ positions: $$\mathcal{L}_{\mathrm{traj}} = \frac{1}{NH} \sum_{n=1}^{N} \sum_{h=1}^{H} \left\| \hat{\mathbf{p}}_{t+h|t}^n - \mathbf{p}_{t+h}^n \right\|_2^2,$$ where $N$ is the number of tracked obstacles and $H$ is the prediction horizon.
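The loss is straightforward to implement; a numpy version over batched predictions (shape `(N, H, 2)`) matching the $1/(NH)$ averaging above:

```python
import numpy as np

def traj_loss(pred, gt):
    """Mean squared trajectory error L_traj.

    pred, gt: arrays of shape (N, H, 2) - N obstacles, H horizon steps,
    2D positions. The squared L2 norm sums over the 2 coordinates and the
    result is averaged over the N*H predictions.
    """
    diff = pred - gt
    return np.sum(diff**2) / (pred.shape[0] * pred.shape[1])
```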

Optimization is performed using Adam with learning rate $10^{-3}$ (decayed on plateau), batch size $\approx 32$, weight decay $10^{-4}$, and dropout $0.1$ on the output layer over 2000 epochs.

6. Empirical Performance and Ablation Analysis

D-KalmanNet achieves significant improvements over both classical and neural baselines. On the AevaScenes highway dataset with prediction horizon $H = 5$ and 10 Hz update rate, D-KalmanNet obtains $-35.80 \pm 8.62$ dB NMSE, versus $-27.15 \pm 6.39$ dB for the Doppler-aided KF, $-23.38 \pm 7.82$ dB for KalmanNet, and $-22.66 \pm 6.41$ dB for the vanilla KF (Table III, (Zuo et al., 29 Nov 2025)). At longer horizons ($H = 10$), D-KalmanNet retains a 5–12 dB advantage, with no rapid error saturation.
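For readers reproducing such comparisons, NMSE in decibels is conventionally the error energy normalized by the signal energy; the exact normalization used in Table III is not stated here, so the following is the standard definition as an assumption:

```python
import numpy as np

def nmse_db(est, gt):
    """Normalized MSE in dB: 10 log10( sum||est - gt||^2 / sum||gt||^2 ).

    More negative is better; an estimate matching the ground truth up to
    small residuals yields a large negative dB value.
    """
    num = np.sum((est - gt) ** 2)   # error energy
    den = np.sum(gt ** 2)           # reference signal energy
    return 10.0 * np.log10(num / den)
```

As a sanity check, a uniform 10% overestimate of a unit signal gives an error-to-signal ratio of $0.01$, i.e. $-20$ dB.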

Hardware efficiency is demonstrated by real-time tracking of up to 10 obstacles at 14.4 Hz on Jetson Orin NX, utilizing only 42% CPU and approximately 100 MB GPU memory (Table IV).

Ablation studies show the importance of the Doppler-grouping process and the learned gain: reverting to pointwise radial velocity inputs or analytical gain computation causes a 6–10 dB NMSE performance degradation and impairs robustness at low update rates.

| Model Variant | NMSE (dB) | Notes |
|---|---|---|
| D-KalmanNet | $-35.80 \pm 8.62$ | Highway, $H=5$, 10 Hz |
| Doppler-aided KF | $-27.15 \pm 6.39$ | |
| KalmanNet | $-23.38 \pm 7.82$ | |
| Vanilla KF | $-22.66 \pm 6.41$ | |

7. Key Innovations and Significance

D-KalmanNet advances prior work in both filtering and neural state estimation by:

  • Physical interpretability: By embedding Doppler-derived velocity and acceleration in the state, it enables direct, model-based interpretation of orientation, speed, and maneuvering.
  • Neural gain learning: Only the Kalman gain is learned, not the transition model. This yields data efficiency and interpretability while allowing robust adaptation to real-world noise and nonlinearity.
  • Measurement fusion: Doppler velocity rectification integrates instantaneous point velocities into a stable observation vector, outperforming per-point or naïve RNN approaches.
  • Efficiency: The architecture enables 15–100 Hz updates on embedded hardware, suitable for online planning in highly dynamic environments.

A plausible implication is that D-KalmanNet’s design principles—neural augmentation of classical state estimators under physically meaningful observation models—can generalize to other robotic perception domains where sensor fusion and partial observability are dominant challenges (Zuo et al., 29 Nov 2025).
