RTS Smoother: Optimal State Estimation
- The RTS smoother is a two-pass, fixed-interval state estimation algorithm that refines state estimates using both past and future observations.
- It achieves MMSE optimality under linear–Gaussian assumptions and extends to non-Gaussian, nonlinear, and manifold-valued systems.
- The algorithm underpins offline trajectory estimation across diverse fields, offering robust performance against noise, outliers, and cyber-attacks.
The Rauch–Tung–Striebel (RTS) smoother is a two-pass fixed-interval optimal state estimation algorithm for linear and certain nonlinear state-space models, providing the minimum mean-square error (MMSE) estimate of a hidden Markov process given all available observations over a finite horizon. In the classical setting, the RTS smoother executes a forward filtering recursion (typically the Kalman or extended Kalman filter), followed by a backward recursion that refines the state estimates using future measurements. Under linear–Gaussian assumptions it is statistically efficient, attaining the Cramér–Rao bound. Extensions of the RTS smoother have been developed for systems with manifold-valued states, measurement outliers, cyber-attacked observations, and non-Gaussian noise, as well as via learned deep architectures. The algorithm forms the backbone of offline trajectory estimation in a broad array of applied and theoretical domains.
1. State-Space Model Formulations
The canonical RTS smoother applies to linear, discrete-time, time-varying state-space models

$$x_{k+1} = F_k x_k + w_k, \quad w_k \sim \mathcal{N}(0, Q_k), \qquad y_k = H_k x_k + v_k, \quad v_k \sim \mathcal{N}(0, R_k),$$

with initial state $x_0 \sim \mathcal{N}(m_0, P_0)$ and known parameters $\{F_k, H_k, Q_k, R_k\}$. Nonlinear models,

$$x_{k+1} = f(x_k) + w_k, \qquad y_k = h(x_k) + v_k,$$

require linearization (EKF, UKF, or Gaussian-integral approaches).
Continuous-time versions are posed via stochastic differential equations (SDEs)

$$dx_t = A_t x_t\,dt + B_t\,dW_t, \qquad dy_t = C_t x_t\,dt + D_t\,dV_t,$$

where $W_t$, $V_t$ are standard Brownian motions. The smoother is formulated for trajectory estimation over the interval $[0, T]$.
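For concreteness, the following sketch simulates a small model of exactly this linear–Gaussian form. The constant-velocity tracking setup, dimensions, and noise levels are illustrative assumptions, not taken from the cited sources:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant-velocity model: state x = [position, velocity].
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # transition matrix F_k (time-invariant here)
H = np.array([[1.0, 0.0]])              # observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance Q_k
R = np.array([[0.25]])                  # measurement noise covariance R_k
m0, P0 = np.zeros(2), np.eye(2)

N = 100
x = rng.multivariate_normal(m0, P0)     # draw x_0 ~ N(m0, P0)
xs, ys = [], []
for _ in range(N + 1):
    xs.append(x)
    ys.append(H @ x + rng.multivariate_normal(np.zeros(1), R))
    x = F @ x + rng.multivariate_normal(np.zeros(2), Q)
xs, ys = np.array(xs), np.array(ys)     # trajectories of x_k and y_k
```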
2. RTS Smoothing Recursion
Given filtered (posterior) means $m_{k|k}$ and covariances $P_{k|k}$, and predicted (prior) quantities $m_{k+1|k} = F_k m_{k|k}$ and $P_{k+1|k} = F_k P_{k|k} F_k^\top + Q_k$ from the forward pass, the RTS backward recursion for $k = N-1, \dots, 0$ is

$$G_k = P_{k|k} F_k^\top P_{k+1|k}^{-1},$$
$$m_{k|N} = m_{k|k} + G_k \left( m_{k+1|N} - m_{k+1|k} \right),$$
$$P_{k|N} = P_{k|k} + G_k \left( P_{k+1|N} - P_{k+1|k} \right) G_k^\top,$$

initialized at the final time by $m_{N|N}$ and $P_{N|N}$ (Weng et al., 2023, Surya, 2023).
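A minimal NumPy sketch of this two-pass procedure, assuming a time-invariant model (F, H, Q, R) for brevity; the function reuses the variables from the simulation sketch in Section 1:

```python
import numpy as np

def rts_smooth(ys, F, H, Q, R, m0, P0):
    """Forward Kalman filter followed by the RTS backward recursion."""
    N = len(ys) - 1
    m_f = np.zeros((N + 1, len(m0)))            # filtered means m_{k|k}
    P_f = np.zeros((N + 1, len(m0), len(m0)))   # filtered covariances P_{k|k}
    m, P = m0, P0
    for k in range(N + 1):
        S = H @ P @ H.T + R                     # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
        m = m + K @ (ys[k] - H @ m)
        P = P - K @ S @ K.T
        m_f[k], P_f[k] = m, P
        m, P = F @ m, F @ P @ F.T + Q           # predict m_{k+1|k}, P_{k+1|k}
    m_s, P_s = m_f.copy(), P_f.copy()           # initialize at m_{N|N}, P_{N|N}
    for k in range(N - 1, -1, -1):
        Pp = F @ P_f[k] @ F.T + Q               # P_{k+1|k}
        G = P_f[k] @ F.T @ np.linalg.inv(Pp)    # smoothing gain G_k
        m_s[k] = m_f[k] + G @ (m_s[k + 1] - F @ m_f[k])
        P_s[k] = P_f[k] + G @ (P_s[k + 1] - Pp) @ G.T
    return m_s, P_s
```

On simulated data, m_s, P_s = rts_smooth(ys, F, H, Q, R, m0, P0) typically yields lower mean-square error than the filtered estimates alone, since each smoothed mean also conditions on future measurements.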
Extensions for nonlinear models generalize this recursion by computing conditional moments via, e.g., Gaussian integrals for polynomial nonlinearities (Singh et al., 12 Jan 2025), sigma-point methods (Majal et al., 2024), or learning the gains via deep networks (Revach et al., 2021).
Continuous-time analogues employ ODEs for the smoothed conditional mean $m_{t|T}$ and covariance $P_{t|T}$, propagated backward from terminal conditions (see (Kurisaki, 5 Jan 2026, Razavi et al., 15 Dec 2025)):

$$\frac{dm_{t|T}}{dt} = A_t m_{t|T} + B_t B_t^\top P_{t|t}^{-1} \left( m_{t|T} - m_{t|t} \right),$$
$$\frac{dP_{t|T}}{dt} = \left( A_t + B_t B_t^\top P_{t|t}^{-1} \right) P_{t|T} + P_{t|T} \left( A_t + B_t B_t^\top P_{t|t}^{-1} \right)^\top - B_t B_t^\top,$$

with terminal conditions $m_{T|T}$ and $P_{T|T}$ supplied by the forward (Kalman–Bucy) filter.
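A minimal sketch of this backward pass, assuming the filtered mean and covariance from a forward Kalman–Bucy sweep are already available on a time grid, with explicit Euler steps and time-invariant A and B B^T (the names and discretization are illustrative assumptions):

```python
import numpy as np

def rts_smooth_ct(m_f, P_f, A, BBt, ts):
    """Explicit Euler integration of the backward RTS smoothing ODEs.
    m_f, P_f: filtered mean/covariance on the grid ts (from a Kalman-Bucy pass).
    A: drift matrix A_t, BBt: diffusion term B_t B_t^T (time-invariant here)."""
    m_s, P_s = m_f.copy(), P_f.copy()            # terminal condition at t = T
    for i in range(len(ts) - 2, -1, -1):
        h = ts[i + 1] - ts[i]
        Pinv = np.linalg.inv(P_f[i + 1])
        C = A + BBt @ Pinv                       # A_t + B_t B_t^T P_{t|t}^{-1}
        dm = A @ m_s[i + 1] + BBt @ Pinv @ (m_s[i + 1] - m_f[i + 1])
        dP = C @ P_s[i + 1] + P_s[i + 1] @ C.T - BBt
        m_s[i] = m_s[i + 1] - h * dm             # step backward in time
        P_s[i] = P_s[i + 1] - h * dP
    return m_s, P_s
```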
3. Statistical, Optimization, and Information-Theoretic Perspectives
The RTS smoother maximizes the joint smoothing likelihood, or equivalently solves the score equation for incomplete data (Surya, 2023),

$$\nabla_{x_{0:N}} \log p(x_{0:N}, y_{0:N}) = 0,$$

leading to a Newton–Raphson update whose closed-form solution matches the classical RTS formulas and whose error covariance attains the Cramér–Rao lower bound.
From an optimization perspective, the linear Gaussian smoothing problem is an unconstrained minimum-energy or MAP quadratic program,

$$\min_{x_{0:N}} \;\; \|x_0 - m_0\|_{P_0^{-1}}^2 + \sum_{k=0}^{N-1} \|x_{k+1} - F_k x_k\|_{Q_k^{-1}}^2 + \sum_{k=0}^{N} \|y_k - H_k x_k\|_{R_k^{-1}}^2,$$

whose KKT (normal-equation) system is block-tridiagonal (Aravkin et al., 2013, Barfoot et al., 2019). Block-tridiagonal matrix sweeps (Thomas algorithm) are algebraically equivalent to the RTS two-pass procedure.
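This equivalence can be checked directly on a small problem: assemble the block-tridiagonal information matrix of the quadratic program and solve it in one shot. A dense sketch for small N (a banded or block-tridiagonal solver would replace np.linalg.solve in practice; the time-invariant model and variable names are assumptions carried over from the earlier sketches):

```python
import numpy as np

def map_smooth_dense(ys, F, H, Q, R, m0, P0):
    """Solve the batch MAP normal equations; equivalent to RTS smoothing."""
    N, n = len(ys) - 1, len(m0)
    Qi, Ri, P0i = np.linalg.inv(Q), np.linalg.inv(R), np.linalg.inv(P0)
    Hmat = np.zeros(((N + 1) * n, (N + 1) * n))  # block-tridiagonal information matrix
    b = np.zeros((N + 1) * n)
    for k in range(N + 1):
        s = slice(k * n, (k + 1) * n)
        Hmat[s, s] += H.T @ Ri @ H               # measurement information
        b[s] += H.T @ Ri @ ys[k]
        if k < N:                                # dynamics coupling blocks
            s1 = slice((k + 1) * n, (k + 2) * n)
            Hmat[s, s] += F.T @ Qi @ F
            Hmat[s1, s1] += Qi
            Hmat[s, s1] -= F.T @ Qi
            Hmat[s1, s] -= Qi @ F
    Hmat[:n, :n] += P0i                          # prior information on x_0
    b[:n] += P0i @ m0
    return np.linalg.solve(Hmat, b).reshape(N + 1, n)
```

Up to numerical precision, the returned means coincide with those produced by the rts_smooth two-pass sketch above.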
4. Extensions and Robustness
Nonlinear and Non-Gaussian Models
Polynomial expectation evaluations via the "Gaussian-integral RTS smoother" (GIRTSS) outperform standard sigma-point methods when the underlying model is polynomial (Singh et al., 12 Jan 2025). The MEE-RTS smoother replaces the MMSE criterion with minimum error entropy, improving robustness to heavy-tailed noise by minimizing Rényi's quadratic entropy of the errors (He et al., 2023).
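The primitive behind such Gaussian-integral evaluations is that raw moments of a Gaussian obey a simple recursion, so expectations of polynomials are exact rather than sigma-point approximations. A scalar sketch (the recursion E[x^n] = m E[x^{n-1}] + (n-1) P E[x^{n-2}] is standard; the example polynomial is an illustrative assumption):

```python
def gaussian_poly_expectation(coeffs, m, P):
    """Exact E[p(x)] for scalar x ~ N(m, P), with p(x) = sum_i coeffs[i] * x**i."""
    deg = len(coeffs) - 1
    mom = [1.0, m]                               # E[x^0], E[x^1]
    for n in range(2, deg + 1):
        mom.append(m * mom[n - 1] + (n - 1) * P * mom[n - 2])
    return sum(c * mu for c, mu in zip(coeffs, mom))

# Example: E[x^3 + 2x] for x ~ N(1, 0.5) is (m^3 + 3*m*P) + 2*m = 2.5 + 2.0 = 4.5
print(gaussian_poly_expectation([0.0, 2.0, 0.0, 1.0], m=1.0, P=0.5))
```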
Outlier, Corrupted, and Attacked Measurements
The ASOR-URTSS, EMORF/S, and cyber-attack-aware variants modify the forward pass to adapt measurement covariance for selective outlier rejection or cyber-injected noise, then apply the unchanged RTS backward recursion using the adapted covariances (Majal et al., 2024, Chughtai et al., 2023, Kumar et al., 11 Apr 2025). EMORF/S embeds the RTS smoother within an EM framework treating binary outlier indicators as latent and recalibrating the measurement noise during each EM iteration, ultimately providing theoretical bounds via the Bayesian Cramér–Rao Bound.
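A mechanism these robust variants share is adapting the measurement covariance in the forward pass and then reusing the adapted covariances in the standard backward sweep. The following is a generic sketch of that idea, not the exact ASOR-URTSS or EMORF/S update; the chi-square gate and inflation factor are illustrative assumptions:

```python
import numpy as np
from scipy.stats import chi2

def robust_forward_pass(ys, F, H, Q, R, m0, P0, alpha=0.99, inflate=100.0):
    """Kalman forward pass that inflates R_k for gated (suspected) outliers.
    The adapted covariances are then reused unchanged by the RTS backward pass."""
    gate = chi2.ppf(alpha, df=H.shape[0])        # innovation gating threshold
    m, P = m0, P0
    m_f, P_f, R_used = [], [], []
    for y in ys:
        S = H @ P @ H.T + R
        nu = y - H @ m                           # innovation
        if nu @ np.linalg.solve(S, nu) > gate:   # normalized innovation squared
            Rk = inflate * R                     # down-weight a suspected outlier
        else:
            Rk = R
        S = H @ P @ H.T + Rk
        K = P @ H.T @ np.linalg.inv(S)
        m, P = m + K @ nu, P - K @ S @ K.T
        m_f.append(m); P_f.append(P); R_used.append(Rk)
        m, P = F @ m, F @ P @ F.T + Q            # predict next step
    return np.array(m_f), np.array(P_f), R_used
```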
Lie Group and Manifold-Valued States
The invariant RTS (IRTS) and Lie-group implementations generalize RTS to matrix Lie groups (SE(3), SE₂(3) × T(6)), essential for robotics, navigation, and pose estimation, allowing state-independent Jacobians for improved linearization consistency and robust smoothing over manifolds (Laan et al., 2024, Fernandes et al., 2022).
Learning-Based Smoothers
RTSNet integrates trainable RNN modules into the gain computation steps of the classical RTS flow. The network is unfolded for multiple forward–backward passes, learning to adapt the gain structure for model-mismatch and nonlinearity, with demonstrated performance improvements over classic smoothers (Revach et al., 2021).
5. Continuous-Time, Pathwise, and Quantum Generalizations
The continuous-time RTS smoother uses ODE-based forward Kalman–Bucy filtering and backward smoothing equations, including optimal-control (Onsager–Machlup) and pathwise formulations, in which a backward Ornstein–Uhlenbeck error process provides explicit pathwise Monte Carlo sampling (Kurisaki, 5 Jan 2026).
Quantum generalizations require modified RTS forms to ensure the smoothed state remains a physically valid quantum Gaussian state, as classical formulas may violate quantum constraints (Laverick, 2020, Roy et al., 2013).
6. Applications and Empirical Performance
The RTS smoother delivers substantial gains in high-noise navigation scenarios, including GNSS localization with smartphone-grade sensors: compared with weighted least squares (WLS) and pure filtering, RTS smoothing reduces horizontal positioning error by up to 76.4% in static environments and roughly 46.5% in dynamic tests (Weng et al., 2023).
Robust smoothers like ASOR-URTSS demonstrate bounded posterior influence under severe outlier contamination, with formal guarantees via KL divergence criteria (Majal et al., 2024). Lie-group and IRTS smoothers offer substantial improvements in drone navigation, pose estimation, and SLAM, reducing error variances substantially relative to Euler, quaternion, or batch optimization baselines (Fernandes et al., 2022, Laan et al., 2024).
A summary comparison of empirical and structural attributes:
| Smoother Type | Handles Nonlinearity | Outlier/Cyber Robustness | Manifold/Group States | Achieves CRLB* |
|---|---|---|---|---|
| Classical RTS | Linear/EKF/UKF | No/Partial (UKF/ETSS) | No | Yes |
| GIRTSS | Polynomial Nonlinear | No | No | Yes (when linear/poly) |
| MEE-RTS | Linear/EKF extension | Yes | No | No (min-entropy) |
| EMORF/S, ASOR-URTSS | Nonlinear/EKF/UKF | Yes | No | Yes (under ideal rejector) |
| IRTS, Lie-RTS | Yes (Geometric) | Application-specific | Yes | Yes (under linearity/invariance) |
| RTSNet | Yes (Learned) | Yes (data-driven) | Possible | Data-dependent |
*CRLB = Cramér–Rao lower bound (the minimum achievable error covariance of an unbiased estimator).
7. Algorithmic and Numerical Stability
Block-tridiagonal algebraic perspectives rigorously establish the numerical stability of the RTS smoother, provided the system matrices are uniformly well-conditioned. Alternative backward (Mayne “M” smoother) and two-filter (Mayne–Fraser “MF” smoother) variants offer distinct stability profiles, with M generally more robust to Hessian conditioning, and MF offering parallel sweep advantages but potential endpoint failures. A hybrid RTS/M parallel approach trades minimal communication for guaranteed convergence in distributed settings (Aravkin et al., 2013, Barfoot et al., 2019).
GPU and parallel architectures support temporal parallelization of continuous-time RTS smoothing via associative scan algorithms; this maintains accuracy and yields significant wall-clock speedup over sequential integration for both linear and nonlinear state-space models (Razavi et al., 15 Dec 2025).
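The key to such temporal parallelization is recasting the backward recursion as an associative combination of per-step elements, which a parallel prefix scan can evaluate in logarithmic depth. The sketch below emulates the scan sequentially in NumPy using the standard affine parameterization m_{k|N} = E_k m_{k+1|N} + g_k, P_{k|N} = E_k P_{k+1|N} E_k^T + L_k; it illustrates the combination rule rather than reproducing the cited GPU implementation:

```python
import numpy as np

def smoothing_elements(m_f, P_f, F, Q):
    """Per-step affine elements (E_k, g_k, L_k) of the backward smoothing map."""
    elems = []
    for k in range(len(m_f) - 1):
        Pp = F @ P_f[k] @ F.T + Q                     # P_{k+1|k}
        G = P_f[k] @ F.T @ np.linalg.inv(Pp)          # smoothing gain G_k
        elems.append((G, m_f[k] - G @ F @ m_f[k], P_f[k] - G @ Pp @ G.T))
    elems.append((np.zeros_like(P_f[-1]), m_f[-1], P_f[-1]))  # terminal element
    return elems

def combine(a, b):
    """Associative composition of two affine-Gaussian smoothing elements."""
    Ea, ga, La = a
    Eb, gb, Lb = b
    return (Ea @ Eb, Ea @ gb + ga, Ea @ Lb @ Ea.T + La)

def scan_smooth(elems):
    """Reverse cumulative combine; a parallel scan evaluates this in O(log N) depth."""
    out = [elems[-1]]
    for e in reversed(elems[:-1]):
        out.append(combine(e, out[-1]))
    out.reverse()
    means = [g for _, g, _ in out]                    # smoothed means m_{k|N}
    covs = [L for _, _, L in out]                     # smoothed covariances P_{k|N}
    return np.array(means), np.array(covs)
```

Because combine is associative, the reverse cumulative loop can be replaced by any parallel prefix-scan primitive without changing the result.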
8. Theoretical Efficiency and Information Bounds
The classical RTS smoother is not only MMSE-optimal for linear Gaussian systems but achieves equality in the Cramér–Rao bound for estimation error covariance (Surya, 2023). Its covariance matches the inverse expected Fisher information, implying full statistical efficiency.
Batch variational inference (ESGVI) recovers the exact RTS smoother for linear models through one-step solution of the posterior information system, with identical computational complexity and storage cost (Barfoot et al., 2019).
References
Principal sources for this article include (Weng et al., 2023, Surya, 2023, Aravkin et al., 2013, Singh et al., 12 Jan 2025, Majal et al., 2024, Roy et al., 2013, Laan et al., 2024, Fernandes et al., 2022, Revach et al., 2021, Razavi et al., 15 Dec 2025, Barfoot et al., 2019, Chughtai et al., 2023, Kumar et al., 11 Apr 2025, Kurisaki, 5 Jan 2026, He et al., 2023, Laverick, 2020). Each provides rigorous derivations, implementation guidelines, and application-specific empirical results.