D-KalmanNet: Doppler Kalman Neural Network
- The paper introduces D-KalmanNet, a recurrent neural architecture that fuses physics-based Gaussian state-space models with neural gain learning for agile tracking of dynamic obstacles.
- It employs Doppler velocity rectification to fuse instantaneous LiDAR measurements into a robust observation model, ensuring accurate and low-latency state estimation.
- Empirical results on real-world datasets show D-KalmanNet outperforms both classical and neural baseline filters in NMSE and processing efficiency for dynamic motion planning.
The Doppler Kalman Neural Network (D-KalmanNet) is a recurrent neural architecture designed for high-frequency, data-efficient tracking of dynamic obstacles in environments sensed by Doppler LiDAR. It integrates physically grounded Gaussian state-space modeling with neural gain learning, enabling accurate and low-latency state estimation in motion planning systems such as DPNet. D-KalmanNet employs Doppler velocity rectification to fuse instantaneous velocity measurements into a robust estimation pipeline, supporting agile and robust motion planning in highly dynamic scenarios (Zuo et al., 29 Nov 2025).
1. Probabilistic State-Space Model
D-KalmanNet employs a partially observable Gaussian state-space (GSS) model tailored for Doppler LiDAR-based tracking. Each obstacle at time $t$ is described by a Doppler-augmented six-dimensional state $x_t$ comprising the spatial coordinates, the orientation $\theta_t$, the scalar speed $v_t$, and the acceleration $a_t$. This formulation directly incorporates orientation and kinematic quantities, facilitating physically interpretable tracking.
Observations $y_t$ at each LiDAR sweep consist of the measured position together with a Doppler-fused velocity estimate $\hat{v}_t$.
The state and observation dynamics are modeled as
$$x_{t+1} = F\,x_t + w_t, \qquad y_t = H\,x_t + r_t.$$
Here, $F$ represents block-diagonal constant-acceleration transitions for the positional and kinematic sub-states, and $H$ projects the state to the measured position and speed. The process and measurement covariances of $w_t$ and $r_t$ are not required to be explicitly computed.
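A simplified sketch of such a GSS model is given below: a per-axis constant-acceleration block replicated block-diagonally, with a linear observation exposing position and a velocity channel. The state ordering `[px, vx, ax, py, vy, ay]` and the use of one velocity component as a stand-in for the Doppler-fused speed are illustrative assumptions, not the paper's exact matrices.

```python
import numpy as np

def ca_block(dt: float) -> np.ndarray:
    """Constant-acceleration chain for one axis: [position, velocity, acceleration]."""
    return np.array([
        [1.0, dt, 0.5 * dt**2],
        [0.0, 1.0, dt],
        [0.0, 0.0, 1.0],
    ])

def build_model(dt: float):
    """Illustrative 6-D state [px, vx, ax, py, vy, ay] (an assumed ordering)."""
    F = np.kron(np.eye(2), ca_block(dt))   # block-diag(F_ca, F_ca)
    H = np.zeros((3, 6))
    H[0, 0] = 1.0   # measured px
    H[1, 3] = 1.0   # measured py
    H[2, 1] = 1.0   # velocity channel (proxy for the fused Doppler speed)
    return F, H

F, H = build_model(dt=0.1)
x = np.array([0.0, 1.0, 0.2, 0.0, 0.5, 0.0])   # initial state
x_next = F @ x                                  # noiseless propagation
y = H @ x_next                                  # noiseless observation
```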
2. Kalman Filter-inspired Algorithmic Structure
D-KalmanNet retains the two-step structure of the Kalman filter, but replaces the analytical Kalman gain with a neural network output. The prediction (time) update is
$$\hat{x}_{t|t-1} = F\,\hat{x}_{t-1|t-1}, \qquad \hat{y}_{t|t-1} = H\,\hat{x}_{t|t-1}.$$
The correction (measurement) update is
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t\,\bigl(y_t - \hat{y}_{t|t-1}\bigr),$$
where $K_t$ is the learned gain.
The Kalman gain $K_t$ is not computed by forward-propagating uncertainty, but is learned directly by a recurrent neural network (RNN) from the tuple of the previous posterior state, the prior prediction, the predicted measurement, and the current observation.
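A minimal sketch of this predict/correct loop with an externally supplied gain follows; here a fixed placeholder matrix stands in for the RNN output $K_t$, and the toy 2-D state and 1-D observation are chosen only for brevity.

```python
import numpy as np

def predict(F: np.ndarray, H: np.ndarray, x_post: np.ndarray):
    """Time update: propagate the posterior and form the predicted measurement."""
    x_prior = F @ x_post
    y_pred = H @ x_prior
    return x_prior, y_pred

def correct(x_prior: np.ndarray, K: np.ndarray, y: np.ndarray, y_pred: np.ndarray):
    """Measurement update: apply the (learned) gain to the innovation."""
    return x_prior + K @ (y - y_pred)

# Toy model; K is a placeholder for the learned gain, not an analytical one.
F = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
K = np.array([[0.5], [0.1]])

x_post = np.array([0.0, 1.0])
x_prior, y_pred = predict(F, H, x_post)
x_post_new = correct(x_prior, K, np.array([0.3]), y_pred)
```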
3. Neural Network Architecture for Gain Learning
The gain learning module of D-KalmanNet utilizes a gated recurrent unit (GRU) as its backbone. Key features:
- Input: The concatenated $18$-dimensional vector of the posterior state, prior prediction, predicted measurement, and current Doppler observation.
- Hidden size: Typically in the range 32–128; the exact configuration follows KalmanNet conventions.
- Output: A fully connected linear layer maps GRU outputs to a flattened vector, reshaped to the gain matrix.
- Activations: The GRU employs its standard sigmoid and tanh activations internally; the output layer is linear.
- Integration: At each time step, the fused Doppler measurement and prior predictions are input to the RNN to yield the learned Kalman gain.
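The features above can be sketched as a single GRU cell with a linear head, shown below in plain numpy. The dimensions follow the text (18-D input, hidden size 64 within the stated 32–128 range, $6\times 3$ gain); the random weights stand in for trained parameters, and the exact layer layout is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GainGRU:
    """Sketch of the gain-learning module: one GRU cell plus a linear head."""

    def __init__(self, in_dim=18, hidden=64, state_dim=6, obs_dim=3, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # Stacked weights for the update (z), reset (r), and candidate (n) gates.
        self.Wx = rng.normal(scale=s, size=(3 * hidden, in_dim))
        self.Wh = rng.normal(scale=s, size=(3 * hidden, hidden))
        self.b = np.zeros(3 * hidden)
        self.Wo = rng.normal(scale=s, size=(state_dim * obs_dim, hidden))
        self.hidden = hidden
        self.shape = (state_dim, obs_dim)

    def step(self, x, h):
        gx = self.Wx @ x + self.b
        gh = self.Wh @ h
        H = self.hidden
        z = sigmoid(gx[:H] + gh[:H])            # update gate
        r = sigmoid(gx[H:2*H] + gh[H:2*H])      # reset gate
        n = np.tanh(gx[2*H:] + r * gh[2*H:])    # candidate state
        h_new = (1.0 - z) * n + z * h           # GRU state update
        K = (self.Wo @ h_new).reshape(self.shape)  # flattened output -> gain
        return K, h_new

net = GainGRU()
h = np.zeros(net.hidden)
K, h = net.step(np.zeros(18), h)
```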
This learned gain approach adapts to changes in motion dynamics and measurement quality, directly correcting for real-world nonlinearities and model mismatches within a physical Gaussian state-space.
4. Doppler Velocity Rectification and Observation Processing
Raw radial Doppler LiDAR returns are aggregated per obstacle by projecting all points inside each obstacle's bounding box into a unified 2D velocity direction, then averaging. This Doppler velocity rectification (specifically Eqs. 7–8 in DPNet) reduces noise in FMCW measurements (whose typical noise level follows the manufacturer specification) and mitigates measurement bias. The resulting fused 2D velocity estimate is pivotal for robust speed and heading estimation, enabling the subsequent Kalman loop to remain robust under noise and partial observations.
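The snippet below sketches one plausible variant of this aggregation: each radial speed is projected onto an assumed common motion direction and the per-ray speed estimates are averaged. The function name, the `eps` rejection threshold for near-perpendicular rays, and the choice of direction are illustrative assumptions; DPNet's Eqs. 7–8 define the exact scheme.

```python
import numpy as np

def fuse_doppler(points, radial_speeds, direction, sensor=np.zeros(2), eps=0.2):
    """Fuse per-point radial Doppler speeds into one 2-D velocity estimate.

    Each return i at 2-D position points[i] carries a radial speed
    v_r[i] = v . u_i, where u_i is the unit ray from the sensor. Assuming
    all returns share a velocity along `direction`, each return implies a
    speed v_r[i] / (direction . u_i); these are averaged after discarding
    rays nearly perpendicular to the motion (|direction . u_i| < eps).
    """
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    rays = points - sensor
    u = rays / np.linalg.norm(rays, axis=1, keepdims=True)  # unit ray per point
    proj = u @ d                                            # direction . u_i
    keep = np.abs(proj) > eps
    speeds = radial_speeds[keep] / proj[keep]               # per-ray speed estimates
    return float(np.mean(speeds)) * d                       # fused 2-D velocity

# Synthetic check: returns from an obstacle moving with velocity (2, 1).
rng = np.random.default_rng(1)
pts = rng.uniform(5.0, 8.0, size=(20, 2))
v_true = np.array([2.0, 1.0])
u = pts / np.linalg.norm(pts, axis=1, keepdims=True)
v_r = u @ v_true                   # ideal (noiseless) radial speeds
v_est = fuse_doppler(pts, v_r, direction=v_true)
```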
5. Training Methodology
D-KalmanNet is trained on real-world FMCW-LiDAR sequences paired with ground-truth 2D trajectories (AevaScenes benchmark). The loss function is a mean squared error over predicted future positions,
$$\mathcal{L} = \frac{1}{T}\sum_{t=1}^{T}\bigl\|\hat{p}_{t+H} - p_{t+H}\bigr\|_2^2,$$
where $H$ is the prediction horizon and $\hat{p}_{t+H}$ denotes the predicted position.
Optimization is performed using Adam with a plateau-decayed learning rate, mini-batch training with weight decay, and dropout $0.1$ on the output layer, over 2000 epochs.
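The position-MSE loss above can be sketched as follows (array shapes and names are assumptions for illustration):

```python
import numpy as np

def trajectory_mse(pred, gt):
    """Mean squared error over predicted future 2-D positions.

    pred, gt: arrays of shape (T, 2) holding predicted and ground-truth
    positions over the prediction horizon.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.mean(np.sum((pred - gt) ** 2, axis=-1)))

# Perfect predictions give zero loss; a constant 1 m offset in x gives 1.0.
gt = np.array([[0.0, 0.0], [1.0, 0.5]])
off = gt + np.array([1.0, 0.0])
```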
6. Empirical Performance and Ablation Analysis
D-KalmanNet achieves significant improvements over both classical and neural baselines. On the AevaScenes highway dataset at a 10 Hz update rate, D-KalmanNet attains the lowest NMSE, outperforming the Doppler-aided KF, KalmanNet, and the vanilla KF (Table III, (Zuo et al., 29 Nov 2025)). At longer horizons, D-KalmanNet retains a 5–12 dB advantage, with no rapid error saturation.
Hardware efficiency is demonstrated by real-time tracking of up to 10 obstacles at 14.4 Hz on Jetson Orin NX, utilizing only 42% CPU and approximately 100 MB GPU memory (Table IV).
Ablation studies show the importance of the Doppler-grouping process and the learned gain: reverting to pointwise radial velocity inputs or analytical gain computation causes a 6–10 dB NMSE performance degradation and impairs robustness at low update rates.
| Model Variant | NMSE (dB) | Notes |
|---|---|---|
| D-KalmanNet | | Highway dataset, 10 Hz update rate; best performance |
| Doppler-aided KF | | |
| KalmanNet | | |
| Vanilla KF | | |
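NMSE in decibels, the metric quoted throughout this section, can be computed as below. Normalizing the error energy by the ground-truth energy is a common convention and is assumed here; the paper's exact normalization may differ.

```python
import numpy as np

def nmse_db(pred, gt):
    """Normalized MSE in dB: 10*log10(||pred - gt||^2 / ||gt||^2).

    More negative values indicate better tracking.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    err = np.sum((pred - gt) ** 2)
    ref = np.sum(gt ** 2)
    return float(10.0 * np.log10(err / ref))

# A 10% amplitude error on every coordinate gives 10*log10(0.01) = -20 dB.
gt = np.array([[3.0, 4.0], [6.0, 8.0]])
pred = 1.1 * gt
```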
7. Key Innovations and Significance
D-KalmanNet advances prior work in both filtering and neural state estimation by:
- Physical interpretability: By embedding Doppler-derived velocity and acceleration in the state, it enables direct, model-based interpretation of orientation, speed, and maneuvering.
- Neural gain learning: Only the Kalman gain is learned, not the transition model. This yields data efficiency and interpretability while allowing robust adaptation to real-world noise and nonlinearity.
- Measurement fusion: Doppler velocity rectification integrates instantaneous point velocities into a stable observation vector, outperforming per-point or naïve RNN approaches.
- Efficiency: The architecture enables 15–100 Hz updates on embedded hardware, suitable for online planning in highly dynamic environments.
A plausible implication is that D-KalmanNet’s design principles—neural augmentation of classical state estimators under physically meaningful observation models—can generalize to other robotic perception domains where sensor fusion and partial observability are dominant challenges (Zuo et al., 29 Nov 2025).