
Split-KalmanNet for Robust SLAM

Updated 31 May 2026
  • Split-KalmanNet is a model-based deep learning framework that integrates the EKF with dual RNN modules to enhance SLAM performance under significant model mismatch.
  • It strategically decouples process and measurement uncertainties by employing separate RNNs, preserving geometric consistency through analytical Jacobians.
  • Empirical evaluations demonstrate that Split-KalmanNet achieves near-MMSE performance across varied noise regimes, outperforming both traditional EKF and KalmanNet.

Split-KalmanNet is a robust model-based deep learning framework designed to address simultaneous localization and mapping (SLAM) tasks in the presence of significant model mismatch. Building on the classical extended Kalman filter (EKF) and recent hybrid approaches such as KalmanNet, Split-KalmanNet strategically integrates the EKF’s structural properties with two small recurrent neural networks (RNNs) to achieve state-of-the-art SLAM performance when system and measurement models are imprecisely known. The split architecture enables distinct compensation for process and measurement model uncertainties, allowing the SLAM estimator to maintain accuracy and stability across a wide range of nonideal regimes (Choi et al., 2022).

1. State-Space Model and EKF Limitations

SLAM is conventionally posed as a discrete-time nonlinear state-space estimation problem, in which the full state vector $x_t \in \mathbb{R}^n$ comprises both the robot pose and all landmark coordinates at time $t$, and observations $y_t \in \mathbb{R}^m$ represent measurements such as range and bearing. The model takes the form:

  • State evolution: $x_t = f(x_{t-1}, u_t) + w_t,\quad w_t \sim \mathcal{N}(0, Q_t)$
  • Measurement: $y_t = h(x_t) + v_t,\quad v_t \sim \mathcal{N}(0, R_t)$

where $f$ and $h$ are the nonlinear motion and measurement functions, $u_t$ is the control input, and $Q_t$, $R_t$ are the true process and measurement covariance matrices.
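As a concrete instance of the measurement function $h$, the following is a minimal sketch of a 2D range-bearing observation of a single landmark, with its analytical Jacobian checked against central finite differences. The state layout `[px, py, heading, lx, ly]` and all function names are assumptions chosen for illustration; they are not taken from the source.

```python
import numpy as np

def wrap_angle(a):
    """Wrap an angle to [-pi, pi)."""
    return (a + np.pi) % (2.0 * np.pi) - np.pi

def h_range_bearing(x):
    """Range and bearing from robot pose (px, py, heading) to landmark (lx, ly)."""
    px, py, th, lx, ly = x
    dx, dy = lx - px, ly - py
    return np.array([np.hypot(dx, dy), wrap_angle(np.arctan2(dy, dx) - th)])

def jacobian_range_bearing(x):
    """Analytical Jacobian of h_range_bearing with respect to the state (2 x 5)."""
    px, py, th, lx, ly = x
    dx, dy = lx - px, ly - py
    q = dx**2 + dy**2
    r = np.sqrt(q)
    return np.array([
        [-dx / r, -dy / r, 0.0, dx / r, dy / r],
        [dy / q, -dx / q, -1.0, -dy / q, dx / q],
    ])

# Hypothetical state: robot at (1, 2) with heading 0.3 rad, landmark at (4, 6).
x = np.array([1.0, 2.0, 0.3, 4.0, 6.0])
H = jacobian_range_bearing(x)

# Central finite differences as an independent check of the analytical Jacobian.
eps = 1e-6
H_num = np.zeros((2, 5))
for j in range(5):
    d = np.zeros(5)
    d[j] = eps
    H_num[:, j] = (h_range_bearing(x + d) - h_range_bearing(x - d)) / (2 * eps)

print(np.max(np.abs(H - H_num)))
```

In a full SLAM state with several landmarks, the same 2 x 5 block would be placed in the columns of the pose and of the observed landmark, with zeros elsewhere.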

The extended Kalman filter performs recursive Bayesian estimation using linearizations:

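  • Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ and $P_{t|t-1} = F_t P_{t-1|t-1} F_t^\top + Q_t$, with $F_t = \left.\partial f / \partial x\right|_{\hat{x}_{t-1|t-1}}$
  • Update: Calculate $H_t = \left.\partial h / \partial x\right|_{\hat{x}_{t|t-1}}$, $S_t = H_t P_{t|t-1} H_t^\top + R_t$, Kalman gain $K_t = P_{t|t-1} H_t^\top S_t^{-1}$, then update $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left(y_t - h(\hat{x}_{t|t-1})\right)$ and $P_{t|t} = (I - K_t H_t) P_{t|t-1}$.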
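The prediction and update steps can be sketched in NumPy as follows. This is a textbook EKF cycle under stated assumptions, not the authors' code: the function signature, the constant-velocity toy model, and all numerical values are hypothetical.

```python
import numpy as np

def ekf_step(x_post, P_post, u, y, f, h, F_jac, H_jac, Q, R):
    """One EKF predict/update cycle; returns posterior mean, covariance, and gain."""
    # Prediction: propagate the mean through f and the covariance through F.
    x_prior = f(x_post, u)
    F = F_jac(x_post, u)
    P_prior = F @ P_post @ F.T + Q
    # Update: linearize h at the prior mean and form the Kalman gain.
    H = H_jac(x_prior)
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    innovation = y - h(x_prior)
    x_new = x_prior + K @ innovation
    P_new = (np.eye(len(x_post)) - K @ H) @ P_prior
    return x_new, P_new, K

# Toy linear model (assumed): state [position, velocity], position is measured.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
f = lambda x, u: A @ x
h = lambda x: C @ x
F_jac = lambda x, u: A
H_jac = lambda x: C
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])

x, P = np.zeros(2), np.eye(2)
x, P, K = ekf_step(x, P, None, np.array([1.2]), f, h, F_jac, H_jac, Q, R)
print(x, K.ravel())
```

Note that the gain depends on `Q` and `R` only through `P_prior` and `S`, which is why misspecified covariances translate directly into a miscomputed gain.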

The EKF critically relies on accurate $Q_t$, $R_t$, and correct linearization of $f$ and $h$. In real-world deployments, these are often poorly specified or nonstationary. Mismatched covariances or Jacobians cause severe degradation or divergence due to miscomputed Kalman gains.

2. Split-KalmanNet Architecture

Split-KalmanNet preserves the EKF’s two-step recursion but fundamentally changes the Kalman gain computation. Rather than analytically deriving the gain from assumed covariances, the method uses two separate lightweight RNNs to learn corrections for process (prior) and innovation covariances. The update steps are:

  • Prediction: $\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t)$ (standard EKF).
  • Jacobian Calculation: $H_t = \left.\partial h / \partial x\right|_{\hat{x}_{t|t-1}}$.
  • Innovation Calculation: $\Delta y_t = y_t - h(\hat{x}_{t|t-1})$.
  • Split Gain Assembly: $K_t = G_t^{1} H_t^\top G_t^{2}$, where $G_t^{1}$, $G_t^{2}$ are outputs of RNN #1 and RNN #2, respectively.
  • State Update: $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \Delta y_t$.

Here, $G_t^{1}$ approximates or corrects the prior covariance, and $G_t^{2}$ estimates the inverse innovation covariance, letting each RNN focus on a distinct uncertainty source.
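The recursion above can be sketched as a single function in which the two RNNs are abstracted behind a callable. This is an illustrative sketch, not the authors' implementation: `split_step`, `gain_factors`, and `oracle_factors` are hypothetical names, and the "oracle" simply returns the analytical EKF quantities in place of trained network outputs.

```python
import numpy as np

def split_step(x_post, u, y, f, h, H_jac, gain_factors):
    """One split-gain recursion with externally supplied gain factors.

    gain_factors(x_prior, innovation, H) stands in for the two RNNs and
    returns (G1, G2) with shapes (n, n) and (m, m).
    """
    x_prior = f(x_post, u)               # prediction (standard EKF)
    H = H_jac(x_prior)                   # analytical Jacobian of h
    innovation = y - h(x_prior)          # innovation
    G1, G2 = gain_factors(x_prior, innovation, H)
    K = G1 @ H.T @ G2                    # split gain assembly
    return x_prior + K @ innovation, K   # state update

# Assumed toy model: state [position, velocity], position is measured.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
P_prior = np.array([[2.01, 1.0], [1.0, 1.01]])
R = np.array([[0.25]])

def oracle_factors(x_prior, innovation, H):
    # Stand-in for trained RNNs: return the analytical EKF quantities.
    S = H @ P_prior @ H.T + R
    return P_prior, np.linalg.inv(S)

x_new, K = split_step(np.zeros(2), None, np.array([1.2]),
                      lambda x, u: A @ x, lambda x: C @ x, lambda x: C,
                      oracle_factors)
print(K.ravel())
```

With the oracle factors the result coincides with the EKF gain for this toy model; a learned `gain_factors` would replace the oracle when the true covariances are unknown.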

3. Kalman Gain: Classic vs. Learned Split Formulation

The standard EKF computes

  • $P_{t|t-1} = F_t P_{t-1|t-1} F_t^\top + Q_t$
  • $S_t = H_t P_{t|t-1} H_t^\top + R_t$
  • $K_t = P_{t|t-1} H_t^\top S_t^{-1}$

Split-KalmanNet replaces $P_{t|t-1}$ and $S_t^{-1}$ with neural approximators:

  • RNN #1 ingests EKF-derived features to model (or correct) $P_{t|t-1}$, outputting a matrix $G_t^{1}$.
  • RNN #2 produces $G_t^{2}$, approximating $S_t^{-1}$ based on innovation statistics.

The split Kalman gain becomes $K_t = G_t^{1} H_t^\top G_t^{2}$, explicitly maintaining the geometric structure induced by $H_t$ while flexibly adapting to errors in process and measurement modeling.
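As a consistency check, the short derivation below assumes the notation $P_{t|t-1}$ for the prior covariance, $S_t$ for the innovation covariance, $H_t$ for the measurement Jacobian, and $G_t^{1}$, $G_t^{2}$ for the outputs of RNN #1 and RNN #2. It shows that the factor dimensions are compatible and that the split gain reduces to the EKF gain when both learned factors equal their analytical targets.

```latex
\begin{aligned}
G_t^{1} \in \mathbb{R}^{n \times n},\quad
H_t \in \mathbb{R}^{m \times n},\quad
G_t^{2} \in \mathbb{R}^{m \times m}
&\;\Longrightarrow\;
K_t = G_t^{1} H_t^\top G_t^{2} \in \mathbb{R}^{n \times m},\\
G_t^{1} = P_{t|t-1},\quad G_t^{2} = S_t^{-1}
&\;\Longrightarrow\;
K_t = P_{t|t-1} H_t^\top S_t^{-1}.
\end{aligned}
```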

4. RNN Modules and Input Feature Design

Both RNNs operate as compact gated recurrent modules (e.g., GRU or LSTM), each with a specialized input pipeline:

  • RNN #1 (Prior-Covariance Learner):
    • Inputs: EKF-derived features of the state prediction, optionally including a difference term that captures linearization errors.
    • Outputs: Matrix $G_t^{1}$
  • RNN #2 (Innovation-Covariance Learner):
    • Inputs: innovation-based features, with optional inclusion of a further difference term.
    • Outputs: Matrix $G_t^{2}$

RNN #1 absorbs temporal patterns in prediction errors, targeting process noise and model mismatch in $f$, $Q_t$. RNN #2 specializes in learning from actual and predicted innovations, compensating for discrepancies in $h$, $R_t$.
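To make the module structure concrete, here is a minimal NumPy sketch of two untrained GRU cells, each followed by a linear layer that reshapes the hidden state into a matrix. Everything here is an assumption for illustration: the class name `TinyGRU`, the hidden size, the random weights, and the placeholder feature vectors do not come from the source, and a practical implementation would use a deep learning framework with trained parameters.

```python
import numpy as np

class TinyGRU:
    """Single GRU cell followed by a linear layer that emits a rows x cols matrix."""

    def __init__(self, n_in, n_hidden, rows, cols, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Stacked weights for the update gate, reset gate, and candidate state.
        self.W = rng.normal(0.0, scale, (3, n_hidden, n_in))
        self.U = rng.normal(0.0, scale, (3, n_hidden, n_hidden))
        self.b = np.zeros((3, n_hidden))
        self.W_out = rng.normal(0.0, scale, (rows * cols, n_hidden))
        self.shape = (rows, cols)
        self.h = np.zeros(n_hidden)  # hidden state carried across time steps

    def step(self, features):
        sig = lambda a: 1.0 / (1.0 + np.exp(-a))
        z = sig(self.W[0] @ features + self.U[0] @ self.h + self.b[0])
        r = sig(self.W[1] @ features + self.U[1] @ self.h + self.b[1])
        cand = np.tanh(self.W[2] @ features + self.U[2] @ (r * self.h) + self.b[2])
        self.h = (1.0 - z) * self.h + z * cand
        return (self.W_out @ self.h).reshape(self.shape)

n, m = 4, 2  # assumed state and measurement dimensions
rnn1 = TinyGRU(n_in=n, n_hidden=8, rows=n, cols=n, seed=1)  # emits G1 (n x n)
rnn2 = TinyGRU(n_in=m, n_hidden=8, rows=m, cols=m, seed=2)  # emits G2 (m x m)

H = np.ones((m, n))                     # placeholder Jacobian
G1 = rnn1.step(np.full(n, 0.1))         # placeholder state-side features
G2 = rnn2.step(np.array([0.5, -0.2]))   # placeholder innovation-side features
K = G1 @ H.T @ G2
print(K.shape)  # (4, 2)
```

Keeping the hidden state inside each cell is what lets the module accumulate temporal patterns across successive filter steps.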

5. Training Protocol and Loss Function

Training encompasses both synthetic and complex SLAM datasets:

  • Uniform Circular-Motion Benchmark: 2D agent in circular motion, with linear or polar measurements.
  • SLAM Dataset: 5 landmarks randomly placed, control sequences drawn uniformly, and separate sets of trajectories for training and for testing, with mismatched process and measurement noise levels.

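Noise levels are varied across orders of magnitude to foster robust learning. The end-to-end mean-squared error loss over the RNN parameters $\theta_1$ (RNN #1) and $\theta_2$ (RNN #2) is

$$\mathcal{L}(\theta_1, \theta_2) = \frac{1}{T} \sum_{t=1}^{T} \left\lVert x_t - \hat{x}_{t|t} \right\rVert_2^2$$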

Optimization is performed via alternating stochastic gradient descent:

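  • Fix the RNN #2 parameters $\theta_2$ and update the RNN #1 parameters $\theta_1$ using gradients through the RNN #1 output $G_t^{1}$.
  • Fix $\theta_1$ and update $\theta_2$ via the RNN #2 output $G_t^{2}$.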
  • Alternate until convergence, with early stopping via validation MSE. No explicit regularization is applied.
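The alternating scheme can be illustrated on a scalar toy problem in which the gain is the product `g1 * h * g2` and the expected squared error is available in closed form. This is a deliberately simplified sketch: it uses deterministic gradient descent on two scalars rather than stochastic gradients on RNN weights, and the model values `h`, `p`, `r`, the step size, and the iteration counts are all assumptions.

```python
def loss_and_dk(k, h, p, r):
    """Expected squared error of x_hat = k * y, with y = h * x + v, and its derivative in k.

    Assumes x has variance p, v has variance r, and x and v are uncorrelated.
    """
    loss = (1.0 - k * h) ** 2 * p + k ** 2 * r
    dk = -2.0 * h * (1.0 - k * h) * p + 2.0 * k * r
    return loss, dk

h, p, r = 1.5, 2.0, 0.5
k_opt = p * h / (h * h * p + r)     # MMSE gain for this scalar model
g1, g2 = 0.5, 0.5                   # stand-ins for the two learned factors
lr, inner_steps = 0.02, 20

for _ in range(10):
    for _ in range(inner_steps):    # fix g2, update g1 (chain rule: dk/dg1 = h * g2)
        _, dk = loss_and_dk(g1 * h * g2, h, p, r)
        g1 -= lr * dk * (h * g2)
    for _ in range(inner_steps):    # fix g1, update g2 (chain rule: dk/dg2 = g1 * h)
        _, dk = loss_and_dk(g1 * h * g2, h, p, r)
        g2 -= lr * dk * (g1 * h)

k = g1 * h * g2
print(k, k_opt)
```

Because only the product of the two factors enters the loss, many pairs `(g1, g2)` yield the same gain; each phase is a well-conditioned quadratic problem in the factor being updated.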

6. Empirical Performance

Extensive simulation results are reported for both canonical and SLAM-specific tasks:

  • Linear Measurement (Circular Motion): As the noise-variance ratio grows, the classical EKF (with access to ground-truth models) remains at MMSE performance, while KalmanNet degrades rapidly beyond a threshold ratio. In contrast, Split-KalmanNet continues to attain MMSE performance at larger ratios.
  • Nonlinear Measurement: Split-KalmanNet slightly outperforms EKF when noise is small (exploiting linearization error) and remains robust for large noise-variance ratios, where KalmanNet fails.
  • Online SLAM (MSE, dB): Across the tested noise levels, Split-KalmanNet matches the EKF with perfect model knowledge, while both the EKF with mismatched models and KalmanNet perform worse. Under heterogeneous range and bearing noise, Split-KalmanNet consistently aligns with the perfect EKF, outperforming both the mismatched EKF and KalmanNet.
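For reference, an MSE in decibels can be computed as in the sketch below. The normalization (squared error summed over state dimensions, then averaged over time) is an assumption; the source does not state the exact convention, and the sample trajectory is synthetic.

```python
import numpy as np

def mse_db(x_true, x_est):
    """Mean squared state error over a trajectory, expressed in decibels."""
    err = np.asarray(x_true) - np.asarray(x_est)
    mse = np.mean(np.sum(err ** 2, axis=-1))  # sum over state dims, mean over time
    return 10.0 * np.log10(mse)

# Synthetic example: 3 time steps, 2 state dimensions, constant error of 0.1 per dimension.
x_true = np.zeros((3, 2))
x_est = np.full((3, 2), 0.1)
print(mse_db(x_true, x_est))
```

On this scale, a 10 dB gap corresponds to a tenfold difference in mean squared error.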

7. Impact and Structural Advantages

The design of Split-KalmanNet confers several critical advantages:

  • Decoupling Error Sources: By splitting the gain computation, the estimator isolates process model uncertainty from measurement model uncertainty. This modular handling is particularly effective under heterogeneous or time-varying noise scenarios.
  • Preservation of Geometric Structure: The use of the analytically computed Jacobian $H_t$ maintains the correct geometric dependence between state and measurement, anchoring the data-driven corrections in physically meaningful space.
  • Robust Bias-Variance Behavior: Empirical evidence demonstrates bias reduction and improved generalization compared to both traditional EKF and KalmanNet, especially at high levels of process or measurement noise mismatch.
  • Theoretical Rationale: The statistical independence of process and measurement covariances is reflected in the split architecture, aligning with the divide-and-conquer principle in robust filtering.

Collectively, Split-KalmanNet provides a computationally efficient, model-guided deep learning solution for online SLAM, capable of robust operation across diverse and challenging regimes characterized by severe model mismatch (Choi et al., 2022).

References (1)
