RTS Smoother: Optimal State Estimation
- The RTS smoother is a two-pass, fixed-interval state estimation algorithm that refines state estimates using both past and future observations.
- It achieves MMSE optimality under linear–Gaussian assumptions and extends to non-Gaussian, nonlinear, and manifold-valued systems.
- The algorithm underpins offline trajectory estimation across diverse fields, offering robust performance against noise, outliers, and cyber-attacks.
The Rauch–Tung–Striebel (RTS) smoother is a two-pass fixed-interval optimal state estimation algorithm for linear and certain nonlinear state-space models, providing the minimum mean-square error (MMSE) estimate of a hidden Markov process given all available observations over a finite horizon. In the classical setting, the RTS smoother executes a forward filtering recursion (typically the Kalman or extended Kalman filter), followed by a backward recursion that refines the state estimates using future measurements. Under linear–Gaussian assumptions it is statistically efficient, attaining the Cramér–Rao bound. Extensions of the RTS smoother have been developed for systems with manifold-valued states, measurement outliers, cyber-attacked observations, and non-Gaussian noise, as well as via learned deep architectures. The algorithm forms the backbone of offline trajectory estimation in a broad array of applied and theoretical domains.
1. State-Space Model Formulations
The canonical RTS smoother applies to linear, discrete-time, time-varying state-space models

$$x_{k+1} = F_k x_k + w_k, \quad w_k \sim \mathcal{N}(0, Q_k), \qquad y_k = H_k x_k + v_k, \quad v_k \sim \mathcal{N}(0, R_k),$$

with initial state $x_0 \sim \mathcal{N}(m_0, P_0)$ and known parameters $\{F_k, H_k, Q_k, R_k\}$. Nonlinear models,

$$x_{k+1} = f(x_k) + w_k, \qquad y_k = h(x_k) + v_k,$$

require linearization (EKF, UKF, or Gaussian-integral approaches).
Continuous-time versions are posed via stochastic differential equations (SDEs)

$$dx_t = A_t x_t\,dt + B_t\,dW_t, \qquad dy_t = C_t x_t\,dt + D_t\,dV_t,$$

where $W_t$, $V_t$ are standard Brownian motions. The smoother is formulated for trajectory estimation over the interval $[0, T]$.
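For concreteness, the following sketch simulates a small model of exactly this linear–Gaussian form. The constant-velocity tracking setup, dimensions, and noise levels are illustrative assumptions, not taken from the cited sources:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant-velocity model: state x = [position, velocity].
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # transition matrix F_k (time-invariant here)
H = np.array([[1.0, 0.0]])              # observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance Q_k
R = np.array([[0.25]])                  # measurement noise covariance R_k
m0, P0 = np.zeros(2), np.eye(2)

N = 100
x = rng.multivariate_normal(m0, P0)     # draw x_0 ~ N(m0, P0)
xs, ys = [], []
for _ in range(N + 1):
    xs.append(x)
    ys.append(H @ x + rng.multivariate_normal(np.zeros(1), R))
    x = F @ x + rng.multivariate_normal(np.zeros(2), Q)
xs, ys = np.array(xs), np.array(ys)     # trajectories of x_k and y_k
```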
2. RTS Smoothing Recursion
Given filtered (posterior) means $m_{k|k}$ and covariances $P_{k|k}$, and predicted (prior) quantities $m_{k+1|k} = F_k m_{k|k}$ and $P_{k+1|k} = F_k P_{k|k} F_k^\top + Q_k$ from the forward pass, the RTS backward recursion for $k = N-1, \dots, 0$ is

$$G_k = P_{k|k} F_k^\top P_{k+1|k}^{-1},$$
$$m_{k|N} = m_{k|k} + G_k \left( m_{k+1|N} - m_{k+1|k} \right),$$
$$P_{k|N} = P_{k|k} + G_k \left( P_{k+1|N} - P_{k+1|k} \right) G_k^\top,$$

initialized at the final time by $m_{N|N}$ and $P_{N|N}$ (Weng et al., 2023, Surya, 2023).
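A minimal NumPy sketch of this two-pass procedure, assuming a time-invariant model (F, H, Q, R) for brevity; the function reuses the variables from the simulation sketch in Section 1:

```python
import numpy as np

def rts_smooth(ys, F, H, Q, R, m0, P0):
    """Forward Kalman filter followed by the RTS backward recursion."""
    N = len(ys) - 1
    m_f = np.zeros((N + 1, len(m0)))            # filtered means m_{k|k}
    P_f = np.zeros((N + 1, len(m0), len(m0)))   # filtered covariances P_{k|k}
    m, P = m0, P0
    for k in range(N + 1):
        S = H @ P @ H.T + R                     # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
        m = m + K @ (ys[k] - H @ m)
        P = P - K @ S @ K.T
        m_f[k], P_f[k] = m, P
        m, P = F @ m, F @ P @ F.T + Q           # predict m_{k+1|k}, P_{k+1|k}
    m_s, P_s = m_f.copy(), P_f.copy()           # initialize at m_{N|N}, P_{N|N}
    for k in range(N - 1, -1, -1):
        Pp = F @ P_f[k] @ F.T + Q               # P_{k+1|k}
        G = P_f[k] @ F.T @ np.linalg.inv(Pp)    # smoothing gain G_k
        m_s[k] = m_f[k] + G @ (m_s[k + 1] - F @ m_f[k])
        P_s[k] = P_f[k] + G @ (P_s[k + 1] - Pp) @ G.T
    return m_s, P_s
```

On simulated data, m_s, P_s = rts_smooth(ys, F, H, Q, R, m0, P0) typically yields lower mean-square error than the filtered estimates alone, since each smoothed mean also conditions on future measurements.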
Extensions for nonlinear models generalize this recursion by computing conditional moments via, e.g., Gaussian integrals for polynomial nonlinearities (Singh et al., 12 Jan 2025), sigma-point methods (Majal et al., 2024), or learning the gains via deep networks (Revach et al., 2021).
Continuous-time analogues employ ODEs for the smoothed conditional mean $m_{t|T}$ and covariance $P_{t|T}$, propagated backward from terminal conditions (see (Kurisaki, 5 Jan 2026, Razavi et al., 15 Dec 2025)):

$$\frac{dm_{t|T}}{dt} = A_t m_{t|T} + B_t B_t^\top P_{t|t}^{-1} \left( m_{t|T} - m_{t|t} \right),$$
$$\frac{dP_{t|T}}{dt} = \left( A_t + B_t B_t^\top P_{t|t}^{-1} \right) P_{t|T} + P_{t|T} \left( A_t + B_t B_t^\top P_{t|t}^{-1} \right)^\top - B_t B_t^\top,$$

with terminal conditions $m_{T|T}$ and $P_{T|T}$ supplied by the forward (Kalman–Bucy) filter.
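A minimal sketch of this backward pass, assuming the filtered mean and covariance from a forward Kalman–Bucy sweep are already available on a time grid, with explicit Euler steps and time-invariant A and B B^T (the names and discretization are illustrative assumptions):

```python
import numpy as np

def rts_smooth_ct(m_f, P_f, A, BBt, ts):
    """Explicit Euler integration of the backward RTS smoothing ODEs.
    m_f, P_f: filtered mean/covariance on the grid ts (from a Kalman-Bucy pass).
    A: drift matrix A_t, BBt: diffusion term B_t B_t^T (time-invariant here)."""
    m_s, P_s = m_f.copy(), P_f.copy()            # terminal condition at t = T
    for i in range(len(ts) - 2, -1, -1):
        h = ts[i + 1] - ts[i]
        Pinv = np.linalg.inv(P_f[i + 1])
        C = A + BBt @ Pinv                       # A_t + B_t B_t^T P_{t|t}^{-1}
        dm = A @ m_s[i + 1] + BBt @ Pinv @ (m_s[i + 1] - m_f[i + 1])
        dP = C @ P_s[i + 1] + P_s[i + 1] @ C.T - BBt
        m_s[i] = m_s[i + 1] - h * dm             # step backward in time
        P_s[i] = P_s[i + 1] - h * dP
    return m_s, P_s
```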
3. Statistical, Optimization, and Information-Theoretic Perspectives
The RTS smoother maximizes the joint smoothing likelihood, or equivalently solves the score equation for incomplete data (Surya, 2023),

$$\nabla_{x_{0:N}} \log p(x_{0:N}, y_{0:N}) = 0,$$

leading to a Newton–Raphson update whose closed-form solution matches the classical RTS formulas and whose error covariance attains the Cramér–Rao lower bound.
From an optimization perspective, the linear Gaussian smoothing problem is an unconstrained minimum-energy or MAP quadratic program,

$$\min_{x_{0:N}} \;\; \|x_0 - m_0\|_{P_0^{-1}}^2 + \sum_{k=0}^{N-1} \|x_{k+1} - F_k x_k\|_{Q_k^{-1}}^2 + \sum_{k=0}^{N} \|y_k - H_k x_k\|_{R_k^{-1}}^2,$$

whose KKT (normal-equation) system is block-tridiagonal (Aravkin et al., 2013, Barfoot et al., 2019). Block-tridiagonal matrix sweeps (Thomas algorithm) are algebraically equivalent to the RTS two-pass procedure.
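This equivalence can be checked directly on a small problem: assemble the block-tridiagonal information matrix of the quadratic program and solve it in one shot. A dense sketch for small N (a banded or block-tridiagonal solver would replace np.linalg.solve in practice; the time-invariant model and variable names are assumptions carried over from the earlier sketches):

```python
import numpy as np

def map_smooth_dense(ys, F, H, Q, R, m0, P0):
    """Solve the batch MAP normal equations; equivalent to RTS smoothing."""
    N, n = len(ys) - 1, len(m0)
    Qi, Ri, P0i = np.linalg.inv(Q), np.linalg.inv(R), np.linalg.inv(P0)
    Hmat = np.zeros(((N + 1) * n, (N + 1) * n))  # block-tridiagonal information matrix
    b = np.zeros((N + 1) * n)
    for k in range(N + 1):
        s = slice(k * n, (k + 1) * n)
        Hmat[s, s] += H.T @ Ri @ H               # measurement information
        b[s] += H.T @ Ri @ ys[k]
        if k < N:                                # dynamics coupling blocks
            s1 = slice((k + 1) * n, (k + 2) * n)
            Hmat[s, s] += F.T @ Qi @ F
            Hmat[s1, s1] += Qi
            Hmat[s, s1] -= F.T @ Qi
            Hmat[s1, s] -= Qi @ F
    Hmat[:n, :n] += P0i                          # prior information on x_0
    b[:n] += P0i @ m0
    return np.linalg.solve(Hmat, b).reshape(N + 1, n)
```

Up to numerical precision, the returned means coincide with those produced by the rts_smooth two-pass sketch above.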
4. Extensions and Robustness
Nonlinear and Non-Gaussian Models
Polynomial expectation evaluations via the "Gaussian-integral RTS smoother" (GIRTSS) outperform standard sigma-point methods when the underlying model is polynomial (Singh et al., 12 Jan 2025). The MEE-RTS smoother replaces the MMSE criterion with minimum error entropy, improving robustness to heavy-tailed noise by minimizing Rényi's quadratic entropy of the errors (He et al., 2023).
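The primitive behind such Gaussian-integral evaluations is that raw moments of a Gaussian obey a simple recursion, so expectations of polynomials are exact rather than sigma-point approximations. A scalar sketch (the recursion E[x^n] = m E[x^{n-1}] + (n-1) P E[x^{n-2}] is standard; the example polynomial is an illustrative assumption):

```python
def gaussian_poly_expectation(coeffs, m, P):
    """Exact E[p(x)] for scalar x ~ N(m, P), with p(x) = sum_i coeffs[i] * x**i."""
    deg = len(coeffs) - 1
    mom = [1.0, m]                               # E[x^0], E[x^1]
    for n in range(2, deg + 1):
        mom.append(m * mom[n - 1] + (n - 1) * P * mom[n - 2])
    return sum(c * mu for c, mu in zip(coeffs, mom))

# Example: E[x^3 + 2x] for x ~ N(1, 0.5) is (m^3 + 3*m*P) + 2*m = 2.5 + 2.0 = 4.5
print(gaussian_poly_expectation([0.0, 2.0, 0.0, 1.0], m=1.0, P=0.5))
```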
Outlier, Corrupted, and Attacked Measurements
The ASOR-URTSS, EMORF/S, and cyber-attack-aware variants modify the forward pass to adapt measurement covariance for selective outlier rejection or cyber-injected noise, then apply the unchanged RTS backward recursion using the adapted covariances (Majal et al., 2024, Chughtai et al., 2023, Kumar et al., 11 Apr 2025). EMORF/S embeds the RTS smoother within an EM framework treating binary outlier indicators as latent and recalibrating the measurement noise during each EM iteration, ultimately providing theoretical bounds via the Bayesian Cramér–Rao Bound.
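A mechanism these robust variants share is adapting the measurement covariance in the forward pass and then reusing the adapted covariances in the standard backward sweep. The following is a generic sketch of that idea, not the exact ASOR-URTSS or EMORF/S update; the chi-square gate and inflation factor are illustrative assumptions:

```python
import numpy as np
from scipy.stats import chi2

def robust_forward_pass(ys, F, H, Q, R, m0, P0, alpha=0.99, inflate=100.0):
    """Kalman forward pass that inflates R_k for gated (suspected) outliers.
    The adapted covariances are then reused unchanged by the RTS backward pass."""
    gate = chi2.ppf(alpha, df=H.shape[0])        # innovation gating threshold
    m, P = m0, P0
    m_f, P_f, R_used = [], [], []
    for y in ys:
        S = H @ P @ H.T + R
        nu = y - H @ m                           # innovation
        if nu @ np.linalg.solve(S, nu) > gate:   # normalized innovation squared
            Rk = inflate * R                     # down-weight a suspected outlier
        else:
            Rk = R
        S = H @ P @ H.T + Rk
        K = P @ H.T @ np.linalg.inv(S)
        m, P = m + K @ nu, P - K @ S @ K.T
        m_f.append(m); P_f.append(P); R_used.append(Rk)
        m, P = F @ m, F @ P @ F.T + Q            # predict next step
    return np.array(m_f), np.array(P_f), R_used
```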
Lie Group and Manifold-Valued States
The invariant RTS (IRTS) and Lie-group implementations generalize RTS to matrix Lie groups (SE(3), SE₂(3) × T(6)), essential for robotics, navigation, and pose estimation, allowing state-independent Jacobians for improved linearization consistency and robust smoothing over manifolds (Laan et al., 2024, Fernandes et al., 2022).
Learning-Based Smoothers
RTSNet integrates trainable RNN modules into the gain computation steps of the classical RTS flow. The network is unfolded for multiple forward–backward passes, learning to adapt the gain structure for model-mismatch and nonlinearity, with demonstrated performance improvements over classic smoothers (Revach et al., 2021).
5. Continuous-Time, Pathwise, and Quantum Generalizations
The continuous-time RTS smoother uses ODE-based forward Kalman–Bucy filtering and backward smoothing equations, including optimal-control (Onsager–Machlup) and pathwise formulations, in which a backward Ornstein–Uhlenbeck error process provides explicit pathwise Monte Carlo sampling (Kurisaki, 5 Jan 2026).
Quantum generalizations require modified RTS forms to ensure the smoothed state remains a physically valid quantum Gaussian state, as classical formulas may violate quantum constraints (Laverick, 2020, Roy et al., 2013).
6. Applications and Empirical Performance
The RTS smoother delivers substantial gains in high-noise navigation scenarios, including GNSS localization with smartphone-grade sensors: compared with weighted least squares (WLS) and pure filtering, RTS smoothing reduces horizontal positioning error by up to 76.4% in static environments and roughly 46.5% in dynamic tests (Weng et al., 2023).
Robust smoothers like ASOR-URTSS demonstrate bounded posterior influence under severe outlier contamination, with formal guarantees via KL divergence criteria (Majal et al., 2024). Lie-group and IRTS smoothers offer substantial improvements in drone navigation, pose estimation, and SLAM, reducing error variances substantially relative to Euler, quaternion, or batch optimization baselines (Fernandes et al., 2022, Laan et al., 2024).
A summary comparison of empirical and structural attributes:
| Smoother Type | Handles Nonlinearity | Outlier/Cyber Robustness | Manifold/Group States | Achieves CRLB* |
|---|---|---|---|---|
| Classical RTS | Linear/EKF/UKF | No/Partial (UKF/ETSS) | No | Yes |
| GIRTSS | Polynomial Nonlinear | No | No | Yes (when linear/poly) |
| MEE-RTS | Linear/EKF extension | Yes | No | No (min-entropy) |
| EMORF/S, ASOR-URTSS | Nonlinear/EKF/UKF | Yes | No | Yes (under ideal rejector) |
| IRTS, Lie-RTS | Yes (Geometric) | Application-specific | Yes | Yes (under linearity/invariance) |
| RTSNet | Yes (Learned) | Yes (data-driven) | Possible | Data-dependent |
*CRLB = Cramér–Rao lower bound (the minimum achievable error covariance of an unbiased estimator).
7. Algorithmic and Numerical Stability
Block-tridiagonal algebraic perspectives rigorously establish the numerical stability of the RTS smoother, provided the system matrices are uniformly well-conditioned. Alternative backward (Mayne “M” smoother) and two-filter (Mayne–Fraser “MF” smoother) variants offer distinct stability profiles, with M generally more robust to Hessian conditioning, and MF offering parallel sweep advantages but potential endpoint failures. A hybrid RTS/M parallel approach trades minimal communication for guaranteed convergence in distributed settings (Aravkin et al., 2013, Barfoot et al., 2019).
GPU and parallel architectures support temporal parallelization of continuous-time RTS smoothing via associative scan algorithms; this maintains accuracy and yields significant wall-clock speedup over sequential integration for both linear and nonlinear state-space models (Razavi et al., 15 Dec 2025).
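The key to such temporal parallelization is recasting the backward recursion as an associative combination of per-step elements, which a parallel prefix scan can evaluate in logarithmic depth. The sketch below emulates the scan sequentially in NumPy using the standard affine parameterization m_{k|N} = E_k m_{k+1|N} + g_k, P_{k|N} = E_k P_{k+1|N} E_k^T + L_k; it illustrates the combination rule rather than reproducing the cited GPU implementation:

```python
import numpy as np

def smoothing_elements(m_f, P_f, F, Q):
    """Per-step affine elements (E_k, g_k, L_k) of the backward smoothing map."""
    elems = []
    for k in range(len(m_f) - 1):
        Pp = F @ P_f[k] @ F.T + Q                     # P_{k+1|k}
        G = P_f[k] @ F.T @ np.linalg.inv(Pp)          # smoothing gain G_k
        elems.append((G, m_f[k] - G @ F @ m_f[k], P_f[k] - G @ Pp @ G.T))
    elems.append((np.zeros_like(P_f[-1]), m_f[-1], P_f[-1]))  # terminal element
    return elems

def combine(a, b):
    """Associative composition of two affine-Gaussian smoothing elements."""
    Ea, ga, La = a
    Eb, gb, Lb = b
    return (Ea @ Eb, Ea @ gb + ga, Ea @ Lb @ Ea.T + La)

def scan_smooth(elems):
    """Reverse cumulative combine; a parallel scan evaluates this in O(log N) depth."""
    out = [elems[-1]]
    for e in reversed(elems[:-1]):
        out.append(combine(e, out[-1]))
    out.reverse()
    means = [g for _, g, _ in out]                    # smoothed means m_{k|N}
    covs = [L for _, _, L in out]                     # smoothed covariances P_{k|N}
    return np.array(means), np.array(covs)
```

Because combine is associative, the reverse cumulative loop can be replaced by any parallel prefix-scan primitive without changing the result.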
8. Theoretical Efficiency and Information Bounds
The classical RTS smoother is not only MMSE-optimal for linear Gaussian systems but achieves equality in the Cramér–Rao bound for estimation error covariance (Surya, 2023). Its covariance matches the inverse expected Fisher information, implying full statistical efficiency.
Batch variational inference (ESGVI) recovers the exact RTS smoother for linear models through one-step solution of the posterior information system, with identical computational complexity and storage cost (Barfoot et al., 2019).
References
Principal sources for this article include (Weng et al., 2023, Surya, 2023, Aravkin et al., 2013, Singh et al., 12 Jan 2025, Majal et al., 2024, Roy et al., 2013, Laan et al., 2024, Fernandes et al., 2022, Revach et al., 2021, Razavi et al., 15 Dec 2025, Barfoot et al., 2019, Chughtai et al., 2023, Kumar et al., 11 Apr 2025, Kurisaki, 5 Jan 2026, He et al., 2023, Laverick, 2020). Each provides rigorous derivations, implementation guidelines, and application-specific empirical results.