Split-KalmanNet for Robust SLAM
- Split-KalmanNet is a model-based deep learning framework that integrates the EKF with dual RNN modules to enhance SLAM performance under significant model mismatch.
- It strategically decouples process and measurement uncertainties by employing separate RNNs, preserving geometric consistency through analytical Jacobians.
- Empirical evaluations demonstrate that Split-KalmanNet achieves near-MMSE performance across varied noise regimes, outperforming both traditional EKF and KalmanNet.
Split-KalmanNet is a robust model-based deep learning framework designed to address simultaneous localization and mapping (SLAM) tasks in the presence of significant model mismatch. Building on the classical extended Kalman filter (EKF) and recent hybrid approaches such as KalmanNet, Split-KalmanNet strategically integrates the EKF’s structural properties with two small recurrent neural networks (RNNs) to achieve state-of-the-art SLAM performance when system and measurement models are imprecisely known. The split architecture enables distinct compensation for process and measurement model uncertainties, allowing the SLAM estimator to maintain accuracy and stability across a wide range of nonideal regimes (Choi et al., 2022).
1. State-Space Model and EKF Limitations
SLAM is conventionally posed as a discrete-time nonlinear state-space estimation problem, in which the full state vector $x_t$ comprises both the robot pose and all landmark coordinates at time $t$, and observations $y_t$ represent measurements such as range and bearing. The model takes the form:
- State evolution: $x_t = f(x_{t-1}, u_t) + w_t$, with process noise $w_t$ of covariance $Q$
- Measurement: $y_t = h(x_t) + v_t$, with measurement noise $v_t$ of covariance $R$
where $f$ and $h$ are the nonlinear motion and measurement functions, $u_t$ is the control input, and $Q$, $R$ are the true process and measurement covariance matrices.
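To make the state-space model concrete, the following is a minimal sketch of one possible motion function and range-bearing measurement function for a 2D robot. The unicycle motion model, the state layout `[px, py, heading, landmark coordinates...]`, the time step `dt`, and the helper names `wrap_angle` and `simulate` are illustrative assumptions, not details taken from the source.

```python
import numpy as np

def wrap_angle(a):
    """Wrap an angle (or array of angles) to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def f(x, u, dt=1.0):
    """Assumed motion model: unicycle pose update, landmarks stay fixed.

    x = [px, py, heading, l1x, l1y, ..., lNx, lNy], u = (speed, turn_rate).
    """
    v, omega = u
    x_next = x.copy()
    x_next[0] += dt * v * np.cos(x[2])
    x_next[1] += dt * v * np.sin(x[2])
    x_next[2] = wrap_angle(x[2] + dt * omega)
    return x_next

def h(x):
    """Measurement model: range and bearing from the robot to each landmark."""
    landmarks = x[3:].reshape(-1, 2)
    delta = landmarks - x[:2]
    ranges = np.hypot(delta[:, 0], delta[:, 1])
    bearings = wrap_angle(np.arctan2(delta[:, 1], delta[:, 0]) - x[2])
    return np.column_stack([ranges, bearings]).ravel()

def simulate(x0, controls, Q, R, seed=0):
    """Roll out x_t = f(x_{t-1}, u_t) + w_t and y_t = h(x_t) + v_t."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    states, observations = [], []
    for u in controls:
        x = f(x, u) + rng.multivariate_normal(np.zeros(x.size), Q)
        y = h(x) + rng.multivariate_normal(np.zeros(R.shape[0]), R)
        states.append(x)
        observations.append(y)
    return np.array(states), np.array(observations)
```

Gaussian noise is used here only for convenience; the sketch draws both noise terms from zero-mean distributions with covariances `Q` and `R`.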
The extended Kalman filter performs recursive Bayesian estimation using linearizations:
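- Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ and $P_{t|t-1} = F_t P_{t-1|t-1} F_t^\top + Q$, with $F_t = \partial f / \partial x$ evaluated at $\hat{x}_{t-1|t-1}$
- Update: Calculate $H_t = \partial h / \partial x$ evaluated at $\hat{x}_{t|t-1}$, $S_t = H_t P_{t|t-1} H_t^\top + R$, Kalman gain $K_t = P_{t|t-1} H_t^\top S_t^{-1}$, then update $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left( y_t - h(\hat{x}_{t|t-1}) \right)$ and $P_{t|t} = (I - K_t H_t) P_{t|t-1}$.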
The EKF critically relies on accurate $Q$, $R$, and correct linearization of $f$ and $h$. In real-world deployments, these are often poorly specified or nonstationary. Mismatched covariances or Jacobians cause severe degradation or divergence due to miscomputed Kalman gains.
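The EKF recursion described above can be sketched as a single function. This is a generic textbook EKF step under the assumption that the caller supplies the model functions `f`, `h` and their Jacobians `F_jac`, `H_jac`; the function name `ekf_step` and its argument order are hypothetical.

```python
import numpy as np

def ekf_step(x_post, P_post, u, y, f, h, F_jac, H_jac, Q, R):
    """One EKF predict/update cycle; returns the new estimate, covariance, gain."""
    # Prediction: propagate the estimate and its covariance through the motion model.
    x_prior = f(x_post, u)
    F = F_jac(x_post, u)
    P_prior = F @ P_post @ F.T + Q

    # Update: linearize the measurement model around the predicted state.
    H = H_jac(x_prior)
    S = H @ P_prior @ H.T + R                      # innovation covariance
    # Solve K S = P H^T for K instead of forming an explicit inverse of S.
    K = np.linalg.solve(S.T, (P_prior @ H.T).T).T
    innovation = y - h(x_prior)
    x_new = x_prior + K @ innovation
    P_new = (np.eye(x_prior.size) - K @ H) @ P_prior
    return x_new, P_new, K
```

Because `K` is computed directly from the assumed `Q` and `R`, passing covariances that differ from the true ones changes the gain at every step, which is the failure mode this section describes.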
2. Split-KalmanNet Architecture
Split-KalmanNet preserves the EKF’s two-step recursion but fundamentally changes the Kalman gain computation. Rather than analytically deriving the gain from assumed covariances, the method uses two separate lightweight RNNs to learn corrections for process (prior) and innovation covariances. The update steps are:
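- Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ (standard EKF).
- Jacobian Calculation: $H_t = \partial h / \partial x$ evaluated at $\hat{x}_{t|t-1}$.
- Innovation Calculation: $\Delta y_t = y_t - h(\hat{x}_{t|t-1})$.
- Split Gain Assembly: $K_t = G_t^{(1)} H_t^\top G_t^{(2)}$, where $G_t^{(1)}$, $G_t^{(2)}$ are outputs of RNN #1 and RNN #2, respectively.
- State Update: $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \Delta y_t$.
Here, $G_t^{(1)}$ approximates or corrects the prior covariance, and $G_t^{(2)}$ estimates the inverse innovation covariance, letting each RNN focus on a distinct uncertainty source.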
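The five steps above can be sketched as follows. In this sketch the two RNN outputs are passed in as plain matrices `G1` (state dimension by state dimension) and `G2` (measurement dimension by measurement dimension); how they are produced is left out. The function name `split_gain_step` and its signature are hypothetical.

```python
import numpy as np

def split_gain_step(x_post, u, y, f, h, H_jac, G1, G2):
    """One filter step with the gain assembled as G1 @ H^T @ G2.

    G1 and G2 stand in for the outputs of RNN #1 and RNN #2 at this time step.
    """
    x_prior = f(x_post, u)            # prediction, as in the EKF
    H = H_jac(x_prior)                # analytical measurement Jacobian
    innovation = y - h(x_prior)       # measurement residual
    K = G1 @ H.T @ G2                 # split gain assembly
    x_new = x_prior + K @ innovation  # state update
    return x_new, K
```

Note that no covariance is propagated: the step only needs the previous estimate, the Jacobian, and the two learned matrices.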
3. Kalman Gain: Classic vs. Learned Split Formulation
The standard EKF computes
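- $P_{t|t-1} = F_t P_{t-1|t-1} F_t^\top + Q$
- $S_t = H_t P_{t|t-1} H_t^\top + R$
- $K_t = P_{t|t-1} H_t^\top S_t^{-1}$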
Split-KalmanNet replaces $P_{t|t-1}$ and $S_t^{-1}$ with neural approximators:
- RNN #1 ingests EKF-derived features to model (or correct) $P_{t|t-1}$, outputting a matrix $G_t^{(1)}$.
- RNN #2 produces $G_t^{(2)}$, approximating $S_t^{-1}$ based on innovation statistics.
The split Kalman gain becomes $K_t = G_t^{(1)} H_t^\top G_t^{(2)}$, explicitly maintaining the geometric structure induced by $H_t$ while flexibly adapting to errors in process and measurement modeling.
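As a consistency check on the split form (a worked identity, not a result quoted from the source), write $G_t^{(1)}$ and $G_t^{(2)}$ for the outputs of RNN #1 and RNN #2, $P_{t|t-1}$ for the prior covariance, $H_t$ for the measurement Jacobian, and $R$ for the measurement covariance. If the two networks reproduced the EKF quantities exactly, the split gain would collapse to the classical gain:

$$
G_t^{(1)} = P_{t|t-1}, \quad G_t^{(2)} = \left( H_t P_{t|t-1} H_t^\top + R \right)^{-1}
\;\Longrightarrow\;
K_t = G_t^{(1)} H_t^\top G_t^{(2)} = P_{t|t-1} H_t^\top \left( H_t P_{t|t-1} H_t^\top + R \right)^{-1}.
$$

With $n$ states and $m$ measurements, $G_t^{(1)}$ is $n \times n$, $H_t$ is $m \times n$, and $G_t^{(2)}$ is $m \times m$, so $K_t$ has the usual $n \times m$ shape.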
4. RNN Modules and Input Feature Design
Both RNNs operate as compact gated recurrent modules (e.g., GRU or LSTM), each with a specialized input pipeline:
- RNN #1 (Prior-Covariance Learner):
- Inputs: 2, 3, and optionally the difference 4 to capture linearization errors.
- Outputs: Matrix $G_t^{(1)}$
- RNN #2 (Innovation-Covariance Learner):
- Inputs: 6, 7, with optional inclusion of 8.
- Outputs: Matrix $G_t^{(2)}$
RNN #1 absorbs temporal patterns in prediction errors, targeting process noise and model mismatch in $f$, $Q$. RNN #2 specializes in learning from actual and predicted innovations, compensating for discrepancies in $h$, $R$.
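A compact gated recurrent module that maps a feature vector to a matrix could look like the sketch below. It is a plain NumPy GRU cell with a linear output layer, written only to show the data flow. The class name `TinyGRUMatrixHead`, the layer sizes, the random initialization, and the feature vectors fed to it are all assumptions; the source does not specify the exact architecture, and a real implementation would use a deep learning framework with trainable parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyGRUMatrixHead:
    """Hypothetical stand-in for one RNN module: features in, matrix out."""

    def __init__(self, n_in, n_hidden, out_shape, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = scale * rng.standard_normal((3, n_hidden, n_in))
        self.U = scale * rng.standard_normal((3, n_hidden, n_hidden))
        self.b = np.zeros((3, n_hidden))
        n_out = out_shape[0] * out_shape[1]
        self.W_out = scale * rng.standard_normal((n_out, n_hidden))
        self.b_out = np.zeros(n_out)
        self.out_shape = out_shape
        self.h = np.zeros(n_hidden)  # hidden state carried across time steps

    def step(self, features):
        """Advance the hidden state by one time step and emit a matrix."""
        z = sigmoid(self.W[0] @ features + self.U[0] @ self.h + self.b[0])
        r = sigmoid(self.W[1] @ features + self.U[1] @ self.h + self.b[1])
        h_cand = np.tanh(self.W[2] @ features + self.U[2] @ (r * self.h) + self.b[2])
        self.h = (1.0 - z) * self.h + z * h_cand
        return (self.W_out @ self.h + self.b_out).reshape(self.out_shape)
```

For a problem with 3 states and 2 measurements, one instance with `out_shape=(3, 3)` would play the role of RNN #1 and a second with `out_shape=(2, 2)` the role of RNN #2, each fed its own feature vector at every time step.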
5. Training Protocol and Loss Function
Training encompasses both synthetic and complex SLAM datasets:
- Uniform Circular-Motion Benchmark: 2D agent in circular motion, with linear or polar measurements.
- SLAM Dataset: 5 landmarks randomly placed, control sequences drawn uniformly, trajectories for training (4, 5) and for testing (6, 7) with mismatched process and measurement noise levels.
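The process and measurement noise levels are varied across orders of magnitude to foster robust learning. The end-to-end mean-squared error loss over a trajectory of length $T$ is
$\mathcal{L}(\theta_1, \theta_2) = \frac{1}{T} \sum_{t=1}^{T} \lVert x_t - \hat{x}_{t|t} \rVert_2^2$, where $\theta_1$ and $\theta_2$ denote the parameters of RNN #1 and RNN #2.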
Optimization is performed via alternating stochastic gradient descent:
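- Fix the parameters $\theta_2$ of RNN #2 and update the parameters $\theta_1$ of RNN #1 using gradients through $G_t^{(1)}$.
- Fix $\theta_1$ and update $\theta_2$ via $G_t^{(2)}$.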
- Alternate until convergence, with early stopping via validation MSE. No explicit regularization is applied.
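The alternating scheme can be illustrated on a deliberately tiny problem. In the sketch below, a scalar linear system replaces SLAM, two learnable scalars `theta1` and `theta2` replace the two RNNs (so the gain is simply `theta1 * theta2` with a unit Jacobian), gradients come from central finite differences instead of backpropagation, and each update uses the full trajectory rather than a stochastic mini-batch. All names and constants are assumptions chosen for illustration.

```python
import numpy as np

def simulate_scalar(T=200, a=0.9, q=0.1, r=1.0, seed=0):
    """Scalar system: x_t = a * x_{t-1} + w_t, y_t = x_t + v_t."""
    rng = np.random.default_rng(seed)
    x, y = np.zeros(T), np.zeros(T)
    state = 0.0
    for t in range(T):
        state = a * state + np.sqrt(q) * rng.standard_normal()
        x[t] = state
        y[t] = state + np.sqrt(r) * rng.standard_normal()
    return x, y

def mse_loss(theta1, theta2, x, y, a=0.9):
    """End-to-end MSE of the filter whose gain is theta1 * H * theta2 with H = 1."""
    x_hat, total = 0.0, 0.0
    for t in range(len(x)):
        x_prior = a * x_hat
        gain = theta1 * 1.0 * theta2
        x_hat = x_prior + gain * (y[t] - x_prior)
        total += (x[t] - x_hat) ** 2
    return total / len(x)

def num_grad(fn, value, eps=1e-5):
    """Central finite-difference derivative of a scalar function."""
    return (fn(value + eps) - fn(value - eps)) / (2 * eps)

def train_alternating(x, y, theta1=1.0, theta2=1.0, lr=0.05, rounds=50):
    history = [mse_loss(theta1, theta2, x, y)]
    for _ in range(rounds):
        # Phase 1: theta2 fixed, gradient step on theta1.
        theta1 -= lr * num_grad(lambda v: mse_loss(v, theta2, x, y), theta1)
        # Phase 2: theta1 fixed (at its new value), gradient step on theta2.
        theta2 -= lr * num_grad(lambda v: mse_loss(theta1, v, x, y), theta2)
        history.append(mse_loss(theta1, theta2, x, y))
    return theta1, theta2, history

x, y = simulate_scalar()
theta1, theta2, history = train_alternating(x, y)
```

In practice the loop would stop early once a held-out validation MSE stops improving, as described above; the fixed number of rounds here keeps the sketch short.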
6. Empirical Performance
Extensive simulation results are reported for both canonical and SLAM-specific tasks:
- Linear Measurement (Circular Motion): For growing noise-variance ratio 6, classical EKF (with access to ground-truth models) remains at MMSE performance, while KalmanNet shows rapid degradation beyond 7. In contrast, Split-KalmanNet achieves MMSE up to 8.
- Nonlinear Measurement: Split-KalmanNet slightly outperforms EKF when noise is small (exploiting linearization error) and remains robust for large 9, where KalmanNet fails.
- Online SLAM (MSE, dB): Across 0, Split-KalmanNet matches perfect EKF at 1. EKF with mismatched models is 2 worse; KalmanNet is 3 worse. For heterogeneity in range vs. bearing noise (4), Split-KalmanNet consistently aligns with perfect EKF, outperforming mismatched EKF by 5, and KalmanNet by 6.
7. Impact and Structural Advantages
The design of Split-KalmanNet confers several critical advantages:
- Decoupling Error Sources: By splitting the gain computation, the estimator isolates process model uncertainty from measurement model uncertainty. This modular handling is particularly effective under heterogeneous or time-varying noise scenarios.
- Preservation of Geometric Structure: The use of the analytically computed Jacobian $H_t$ maintains the correct geometric dependence between state and measurement, anchoring the data-driven corrections in physically meaningful space.
- Robust Bias-Variance Behavior: Empirical evidence demonstrates bias reduction and improved generalization compared to both traditional EKF and KalmanNet, especially at high levels of process or measurement noise mismatch.
- Theoretical Rationale: The statistical independence of process and measurement noise is reflected in the split architecture, aligning with the divide-and-conquer principle in robust filtering.
Collectively, Split-KalmanNet provides a computationally efficient, model-guided deep learning solution for online SLAM, capable of robust operation across diverse and challenging regimes characterized by severe model mismatch (Choi et al., 2022).