Split-KalmanNet for Robust SLAM

Updated 31 May 2026

Split-KalmanNet is a model-based deep learning framework that integrates the EKF with dual RNN modules to enhance SLAM performance under significant model mismatch.
It strategically decouples process and measurement uncertainties by employing separate RNNs, preserving geometric consistency through analytical Jacobians.
Empirical evaluations demonstrate that Split-KalmanNet achieves near-MMSE performance across varied noise regimes, outperforming both traditional EKF and KalmanNet.

Split-KalmanNet is a robust model-based deep learning framework designed to address simultaneous localization and mapping (SLAM) tasks in the presence of significant model mismatch. Building on the classical extended Kalman filter (EKF) and recent hybrid approaches such as KalmanNet, Split-KalmanNet strategically integrates the EKF’s structural properties with two small recurrent neural networks (RNNs) to achieve state-of-the-art SLAM performance when system and measurement models are imprecisely known. The split architecture enables distinct compensation for process and measurement model uncertainties, allowing the SLAM estimator to maintain accuracy and stability across a wide range of nonideal regimes (Choi et al., 2022).

1. State-Space Model and EKF Limitations

SLAM is conventionally posed as a discrete-time nonlinear state-space estimation problem, in which the full state vector $x_t \in \mathbb{R}^n$ comprises both the robot pose and all landmark coordinates at time $t$, and observations $y_t \in \mathbb{R}^m$ represent measurements such as range and bearing. The model takes the form:

State evolution: $x_t = f(x_{t-1}, u_t) + w_t,\quad w_t \sim \mathcal{N}(0, Q_t)$
Measurement: $y_t = h(x_t) + v_t,\quad v_t \sim \mathcal{N}(0, R_t)$

where $f$ and $h$ are the nonlinear motion and measurement functions, $u_t$ is the control input, and $Q_t$ and $R_t$ are the true process and measurement noise covariance matrices.
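As a concrete illustration of the state-space model above, the following sketch simulates a 2D range-bearing SLAM system. The unicycle motion model, the state layout `[px, py, theta, l1x, l1y, ...]`, and all function names are assumptions chosen for illustration; the source does not specify the exact form of $f$ and $h$.

```python
import numpy as np

def wrap_angle(a):
    """Wrap an angle (scalar or array) to the interval [-pi, pi)."""
    return (a + np.pi) % (2.0 * np.pi) - np.pi

def f(x, u, dt=1.0):
    """Assumed motion model: unicycle pose update, landmarks stay fixed."""
    x_next = np.array(x, dtype=float)
    px, py, theta = x_next[:3]
    v, omega = u
    x_next[0] = px + v * dt * np.cos(theta)
    x_next[1] = py + v * dt * np.sin(theta)
    x_next[2] = wrap_angle(theta + omega * dt)
    return x_next

def h(x):
    """Assumed measurement model: range and bearing to every landmark."""
    px, py, theta = x[:3]
    landmarks = x[3:].reshape(-1, 2)
    dx, dy = landmarks[:, 0] - px, landmarks[:, 1] - py
    ranges = np.hypot(dx, dy)
    bearings = wrap_angle(np.arctan2(dy, dx) - theta)
    return np.column_stack([ranges, bearings]).ravel()

def simulate(x0, controls, Q, R, rng):
    """Roll out x_t = f(x_{t-1}, u_t) + w_t and y_t = h(x_t) + v_t."""
    xs, ys, x = [], [], np.array(x0, dtype=float)
    for u in controls:
        x = f(x, u) + rng.multivariate_normal(np.zeros(len(x)), Q)
        y = h(x) + rng.multivariate_normal(np.zeros(R.shape[0]), R)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(0)
x0 = np.array([0.0, 0.0, 0.0, 3.0, 4.0, -2.0, 1.0])  # pose + 2 landmarks
controls = [(1.0, 0.1)] * 5
Q = np.diag([1e-2, 1e-2, 1e-3, 0.0, 0.0, 0.0, 0.0])   # landmarks are noise-free
R = np.diag([1e-2, 1e-3] * 2)                          # range, bearing per landmark
xs, ys = simulate(x0, controls, Q, R, rng)
print(xs.shape, ys.shape)  # (5, 7) (5, 4)
```

Here $n = 3 + 2L$ and $m = 2L$ for $L$ landmarks, so the state dimension grows with the map while the pose occupies only the first three entries.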

The extended Kalman filter performs recursive Bayesian estimation using linearizations:

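Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ and $\Sigma_{t|t-1} = F_t \Sigma_{t-1|t-1} F_t^\top + Q_t$, with $F_t = \partial f / \partial x \,|_{x = \hat{x}_{t-1|t-1}}$
Update: Calculate $H_t = \partial h / \partial x \,|_{x = \hat{x}_{t|t-1}}$, $S_t = H_t \Sigma_{t|t-1} H_t^\top + R_t$, Kalman gain $K_t = \Sigma_{t|t-1} H_t^\top S_t^{-1}$, then update $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \, (y_t - h(\hat{x}_{t|t-1}))$ and $\Sigma_{t|t} = (I - K_t H_t) \Sigma_{t|t-1}$.

The EKF critically relies on accurate $Q_t$, $R_t$, and correct linearization of $f$ and $h$. In real-world deployments, these are often poorly specified or nonstationary. Mismatched covariances or Jacobians lead to miscomputed Kalman gains, which in turn cause severe degradation or divergence.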
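The EKF recursion described above can be sketched in a few lines of NumPy. This is a minimal textbook-style step, not code from the source; the function name `ekf_step` and the convention of passing the models and their Jacobians as callables are assumptions made for illustration.

```python
import numpy as np

def ekf_step(x_post, Sigma_post, u, y, f, h, F_jac, H_jac, Q, R):
    """One EKF prediction + update step with user-supplied models and Jacobians."""
    # Prediction: propagate the mean through f and the covariance through F.
    x_prior = f(x_post, u)
    F = F_jac(x_post, u)
    Sigma_prior = F @ Sigma_post @ F.T + Q

    # Update: linearize h around the prior mean and form the innovation covariance.
    H = H_jac(x_prior)
    S = H @ Sigma_prior @ H.T + R
    # K = Sigma_prior H^T S^{-1}; solve a linear system instead of inverting S.
    K = np.linalg.solve(S, H @ Sigma_prior).T

    innovation = y - h(x_prior)
    x_new = x_prior + K @ innovation
    Sigma_new = (np.eye(len(x_prior)) - K @ H) @ Sigma_prior
    return x_new, Sigma_new, K

# Scalar sanity check: identity dynamics and measurement, unit prior and noise.
x_new, Sigma_new, K = ekf_step(
    x_post=np.zeros(1), Sigma_post=np.eye(1), u=None, y=np.array([2.0]),
    f=lambda x, u: x, h=lambda x: x,
    F_jac=lambda x, u: np.eye(1), H_jac=lambda x: np.eye(1),
    Q=np.zeros((1, 1)), R=np.eye(1),
)
print(x_new, Sigma_new, K)
```

Every quantity in the gain depends on the assumed `Q` and `R`, which is why a wrong noise model propagates directly into the state update.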

2. Split-KalmanNet Architecture

Split-KalmanNet preserves the EKF’s two-step recursion but fundamentally changes the Kalman gain computation. Rather than analytically deriving the gain from assumed covariances, the method uses two separate lightweight RNNs to learn corrections for process (prior) and innovation covariances. The update steps are:

Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ (standard EKF).
Jacobian Calculation: $H_t = \partial h / \partial x \,|_{x = \hat{x}_{t|t-1}}$.
Innovation Calculation: $\Delta y_t = y_t - h(\hat{x}_{t|t-1})$.
Split Gain Assembly: $K_t = G^1_t H_t^\top G^2_t$, where $G^1_t$, $G^2_t$ are outputs of RNN #1 and RNN #2, respectively.
State Update: $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \Delta y_t$.

Here, $G^1_t$ approximates or corrects the prior covariance $\Sigma_{t|t-1}$, and $G^2_t$ estimates the inverse innovation covariance $S_t^{-1}$, letting each RNN focus on a distinct uncertainty source.
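The five steps above can be sketched as a single function, assuming the split gain takes the form $K_t = G^1_t H_t^\top G^2_t$. The two learned modules are represented by plain callables (`g1_module`, `g2_module`), which are hypothetical stand-ins for the RNNs; the features passed to them here are likewise an illustrative assumption.

```python
import numpy as np

def split_kalmannet_step(x_post, u, y, f, h, H_jac, g1_module, g2_module):
    """One filtering step where the gain is assembled from two learned matrices."""
    x_prior = f(x_post, u)                 # 1. prediction (standard EKF)
    H = H_jac(x_prior)                     # 2. analytical measurement Jacobian, (m, n)
    innovation = y - h(x_prior)            # 3. innovation
    G1 = g1_module(x_prior - x_post)       # stand-in for RNN #1, returns (n, n)
    G2 = g2_module(innovation)             # stand-in for RNN #2, returns (m, m)
    K = G1 @ H.T @ G2                      # 4. split gain, (n, m)
    x_new = x_prior + K @ innovation       # 5. state update
    return x_new, K

# Toy usage with constant stand-ins instead of trained networks.
n, m = 2, 1
x_new, K = split_kalmannet_step(
    x_post=np.zeros(n), u=np.array([1.0, 0.0]), y=np.array([2.0]),
    f=lambda x, u: x + u, h=lambda x: x[:1],
    H_jac=lambda x: np.array([[1.0, 0.0]]),
    g1_module=lambda feat: np.eye(n),
    g2_module=lambda feat: 0.5 * np.eye(m),
)
print(x_new, K)
```

Note that no covariance matrix is propagated explicitly: the recurrent state of the two modules takes over the role that $\Sigma_{t|t-1}$ and $S_t$ play in the EKF.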

3. Kalman Gain: Classic vs. Learned Split Formulation

The standard EKF computes

$\Sigma_{t|t-1} = F_t \Sigma_{t-1|t-1} F_t^\top + Q_t$
$S_t = H_t \Sigma_{t|t-1} H_t^\top + R_t$
$K_t = \Sigma_{t|t-1} H_t^\top S_t^{-1}$

Split-KalmanNet replaces the prior covariance $\Sigma_{t|t-1}$ and the inverse innovation covariance $S_t^{-1}$ with neural approximators:

RNN #1 ingests EKF-derived features to model (or correct) $\Sigma_{t|t-1}$, outputting a matrix $G^1_t$.
RNN #2 produces $G^2_t$, approximating $S_t^{-1}$ based on innovation statistics.

The split Kalman gain becomes $K_t = G^1_t H_t^\top G^2_t$, explicitly maintaining the geometric structure induced by the measurement Jacobian $H_t$ while flexibly adapting to errors in process and measurement modeling.
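A quick numerical check makes the relationship between the two formulations concrete: if the first learned matrix equals the prior covariance and the second equals the inverse innovation covariance, a gain of the form $G^1 H^\top G^2$ reduces exactly to the EKF gain. The matrices below are random placeholders chosen for illustration, and the helper names are hypothetical.

```python
import numpy as np

def ekf_gain(Sigma_prior, H, R):
    """Classical gain: Sigma_prior H^T (H Sigma_prior H^T + R)^{-1}."""
    S = H @ Sigma_prior @ H.T + R
    return Sigma_prior @ H.T @ np.linalg.inv(S)

def split_gain(G1, H, G2):
    """Split gain: G1 H^T G2, with G1 and G2 supplied by the learned modules."""
    return G1 @ H.T @ G2

rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.normal(size=(n, n))
Sigma_prior = A @ A.T + np.eye(n)          # symmetric positive definite
H = rng.normal(size=(m, n))
R_true = np.diag([0.5, 0.1])

S_true = H @ Sigma_prior @ H.T + R_true
K_ekf = ekf_gain(Sigma_prior, H, R_true)
K_split = split_gain(Sigma_prior, H, np.linalg.inv(S_true))
print(np.allclose(K_ekf, K_split))         # True

# An EKF that assumes a 10x larger measurement noise computes a different gain.
K_mismatch = ekf_gain(Sigma_prior, H, 10.0 * R_true)
print(np.allclose(K_ekf, K_mismatch))      # False
```

The second check illustrates the failure mode the learned modules are meant to absorb: the Jacobian is unchanged, but a wrong covariance alone shifts the gain.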

4. RNN Modules and Input Feature Design

Both RNNs operate as compact gated recurrent modules (e.g., GRU or LSTM), each with a specialized input pipeline:

RNN #1 (Prior-Covariance Learner):
- Inputs: features derived from successive state estimates, optionally augmented with a difference term that captures linearization errors.
- Outputs: Matrix $G^1_t \in \mathbb{R}^{n \times n}$
RNN #2 (Innovation-Covariance Learner):
- Inputs: innovation-based features, with an optional additional feature.
- Outputs: Matrix $G^2_t \in \mathbb{R}^{m \times m}$

RNN #1 absorbs temporal patterns in prediction errors, targeting process noise and model mismatch in $f$ and $Q_t$. RNN #2 specializes in learning from actual and predicted innovations, compensating for discrepancies in $h$ and $R_t$.
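To make the module shapes concrete, here is a minimal NumPy sketch of a GRU cell followed by a linear head that emits a matrix. It is untrained, omits bias terms for brevity, and is not the architecture from the paper; the class name `TinyGRUMatrixHead`, the hidden sizes, and the feature dimensions (`2 * n` and `2 * m`) are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyGRUMatrixHead:
    """Single GRU cell plus a linear layer that reshapes its output to a matrix."""

    def __init__(self, input_dim, hidden_dim, out_shape, rng, scale=0.1):
        cat = input_dim + hidden_dim
        self.Wz = rng.normal(scale=scale, size=(hidden_dim, cat))  # update gate
        self.Wr = rng.normal(scale=scale, size=(hidden_dim, cat))  # reset gate
        self.Wh = rng.normal(scale=scale, size=(hidden_dim, cat))  # candidate state
        self.Wo = rng.normal(scale=scale, size=(out_shape[0] * out_shape[1], hidden_dim))
        self.out_shape = out_shape
        self.state = np.zeros(hidden_dim)   # recurrent memory carried across time steps

    def __call__(self, features):
        xh = np.concatenate([features, self.state])
        z = sigmoid(self.Wz @ xh)
        r = sigmoid(self.Wr @ xh)
        cand = np.tanh(self.Wh @ np.concatenate([features, r * self.state]))
        self.state = (1.0 - z) * self.state + z * cand
        return (self.Wo @ self.state).reshape(self.out_shape)

rng = np.random.default_rng(0)
n, m = 5, 2                                       # state and measurement dimensions
rnn1 = TinyGRUMatrixHead(2 * n, 16, (n, n), rng)  # stand-in prior-covariance learner
rnn2 = TinyGRUMatrixHead(2 * m, 8, (m, m), rng)   # stand-in innovation-covariance learner

H = rng.normal(size=(m, n))                       # placeholder Jacobian
G1 = rnn1(rng.normal(size=2 * n))                 # placeholder state-side features
G2 = rnn2(rng.normal(size=2 * m))                 # placeholder measurement-side features
K = G1 @ H.T @ G2
print(G1.shape, G2.shape, K.shape)  # (5, 5) (2, 2) (5, 2)
```

Because the two modules see different features and emit matrices of different sizes, each can stay small: the first scales with the state dimension, the second only with the measurement dimension.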

5. Training Protocol and Loss Function

Training encompasses both synthetic and complex SLAM datasets:

Uniform Circular-Motion Benchmark: 2D agent in circular motion, with linear or polar measurements.
SLAM Dataset: 5 landmarks placed at random, control sequences drawn uniformly, and separate sets of trajectories for training and for testing, with mismatched process and measurement noise levels.

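The noise variances are varied across orders of magnitude to foster robust learning. The end-to-end mean-squared error loss is

$\mathcal{L}(\Theta_1, \Theta_2) = \frac{1}{T} \sum_{t=1}^{T} \lVert x_t - \hat{x}_{t|t}(\Theta_1, \Theta_2) \rVert_2^2$

where $\Theta_1$ and $\Theta_2$ are the parameters of RNN #1 and RNN #2, and $\hat{x}_{t|t}$ is the filter's posterior state estimate.

Optimization is performed via alternating stochastic gradient descent:

Fix $\Theta_2$ and update $\Theta_1$ using gradients through the RNN #1 output $G^1_t$.
Fix $\Theta_1$ and update $\Theta_2$ via the RNN #2 output $G^2_t$.
Alternate until convergence, with early stopping via validation MSE. No explicit regularization is applied.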
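The alternating scheme can be illustrated on a deliberately simplified toy problem. In the sketch below, each "network" is reduced to a single static scalar (`theta1`, `theta2`), the system is a scalar linear model, and gradients are taken by central finite differences rather than backpropagation. All of these are simplifying assumptions for illustration; only the fix-one, update-the-other pattern mirrors the procedure described above.

```python
import numpy as np

def rollout_mse(theta1, theta2, xs, ys, a=0.9, h=1.0):
    """End-to-end MSE of a scalar filter whose gain is K = theta1 * h * theta2."""
    K = theta1 * h * theta2
    x_hat, sq_err = 0.0, 0.0
    for x_true, y in zip(xs, ys):
        x_prior = a * x_hat
        x_hat = x_prior + K * (y - h * x_prior)
        sq_err += (x_true - x_hat) ** 2
    return sq_err / len(xs)

def fd_grad(fn, theta, eps=1e-5):
    """Central finite-difference derivative of a scalar function."""
    return (fn(theta + eps) - fn(theta - eps)) / (2.0 * eps)

# Simulate x_t = a x_{t-1} + w_t, y_t = h x_t + v_t with unit noise variances.
rng = np.random.default_rng(0)
T, a, h = 300, 0.9, 1.0
xs, ys, x = np.zeros(T), np.zeros(T), 0.0
for t in range(T):
    x = a * x + rng.normal()
    xs[t] = x
    ys[t] = h * x + rng.normal()

theta1, theta2, lr = 0.3, 0.3, 0.02
loss_start = rollout_mse(theta1, theta2, xs, ys)
for epoch in range(200):
    # Phase 1: hold theta2 fixed, take a gradient step on theta1.
    theta1 -= lr * fd_grad(lambda p: rollout_mse(p, theta2, xs, ys), theta1)
    # Phase 2: hold theta1 fixed, take a gradient step on theta2.
    theta2 -= lr * fd_grad(lambda p: rollout_mse(theta1, p, xs, ys), theta2)
loss_end = rollout_mse(theta1, theta2, xs, ys)
print(loss_start, loss_end, theta1 * h * theta2)
```

In this toy setting only the product `theta1 * theta2` is identifiable, so the individual values depend on the initialization. In the full method the two modules are distinguished by their different input features and by the Jacobian that sits between their outputs.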

6. Empirical Performance

Extensive simulation results are reported for both canonical and SLAM-specific tasks:

Linear Measurement (Circular Motion): As the noise-variance ratio grows, the classical EKF (with access to ground-truth models) remains at MMSE performance, while KalmanNet degrades rapidly once the ratio becomes large. In contrast, Split-KalmanNet tracks the MMSE over a wider range of the ratio.
Nonlinear Measurement: Split-KalmanNet slightly outperforms the EKF when noise is small (by compensating for the EKF's linearization error) and remains robust at large noise-variance ratios, where KalmanNet fails.
Online SLAM (MSE, dB): Across the tested noise settings, Split-KalmanNet matches the EKF with perfect model knowledge, whereas the EKF with mismatched models and KalmanNet both perform worse. When range and bearing noise are heterogeneous, Split-KalmanNet consistently aligns with the perfect-knowledge EKF, outperforming both the mismatched EKF and KalmanNet.
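For reference, an MSE figure in dB of the kind used in these comparisons can be computed as follows. The normalization (squared error summed over state dimensions, averaged over time steps) and the function name `mse_db` are assumptions; the source does not state the exact convention.

```python
import numpy as np

def mse_db(x_true, x_est):
    """Mean squared state error over a trajectory, expressed in decibels."""
    err = np.asarray(x_true, dtype=float) - np.asarray(x_est, dtype=float)
    mse = np.mean(np.sum(err ** 2, axis=-1))   # sum over state dims, mean over time
    return 10.0 * np.log10(mse)

# Two hypothetical estimators evaluated on the same ground-truth trajectory.
x_true = np.zeros((3, 2))
est_a = np.ones((3, 2))            # squared error 2 at every step
est_b = 0.1 * np.ones((3, 2))      # squared error 0.02 at every step
gap = mse_db(x_true, est_a) - mse_db(x_true, est_b)
print(mse_db(x_true, est_a), mse_db(x_true, est_b), gap)
```

Because the scale is logarithmic, a gap of 20 dB corresponds to a 100-fold difference in MSE, which is the case for the two placeholder estimators here.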

7. Impact and Structural Advantages

The design of Split-KalmanNet confers several critical advantages:

Decoupling Error Sources: By splitting the gain computation, the estimator isolates process model uncertainty from measurement model uncertainty. This modular handling is particularly effective under heterogeneous or time-varying noise scenarios.
Preservation of Geometric Structure: The use of the analytically computed Jacobian $H_t$ maintains the correct geometric dependence between state and measurement, anchoring the data-driven corrections in physically meaningful space.
Robust Bias-Variance Behavior: Empirical evidence demonstrates bias reduction and improved generalization compared to both traditional EKF and KalmanNet, especially at high levels of process or measurement noise mismatch.
Theoretical Rationale: The statistical independence of process and measurement noise is reflected in the split architecture, aligning with the divide-and-conquer principle in robust filtering.

Collectively, Split-KalmanNet provides a computationally efficient, model-guided deep learning solution for online SLAM, capable of robust operation across diverse and challenging regimes characterized by severe model mismatch (Choi et al., 2022).

References

Choi et al. (2022). Split-KalmanNet: A Robust Model-Based Deep Learning Approach for SLAM.
