Sliding-Window Factor Graph Optimization

Updated 18 December 2025

Sliding-window factor graph optimization is a state estimation paradigm that structures recent system states and sensor measurements within a fixed temporal window.
It integrates diverse sensor factors—such as IMU, GNSS, ultrasonic, and scan registration—with robust covariance and marginalization techniques to ensure computational tractability.
Its sliding window operation supports real-time applications in localization, tracking, and sensor fusion by balancing memory constraints with continuous, consistent state updates.

Sliding-window factor graph optimization (SW-FGO) is a state estimation and data association paradigm that formulates recent system states and measurements within a “window” as a factor graph, incrementally updates the graph as new data arrive, and marginalizes out old states using the Schur complement to preserve sparsity and computational tractability. This technique provides bounded-memory nonlinear optimization, supporting real-time applications in localization, tracking, and sensor fusion under challenging conditions where batch optimization is impractical and recursive filters are suboptimal.

1. Architecture and Problem Formulation

SW-FGO represents the unknown states—such as system poses, velocities, biases, or decision variables—over a fixed recent temporal horizon. At each timestep, states are added at the leading edge of the window, connected by sensor-derived constraint factors; on exceeding the maximum window length, the oldest states are marginalized, with their statistical influence retained in a condensed prior.

For example, in high-precision indoor positioning, the system state at each discrete instant $i$ comprises pose $T_i \in SE(3)$, velocity $v_i$, and IMU biases $b^a_i, b^g_i$, assembled over a window of length $W$ as

$X = \{T_{k-W+1}, v_{k-W+1}, b^a_{k-W+1}, b^g_{k-W+1}, \ldots, T_k, v_k, b^a_k, b^g_k\}$

with measurements encoded as factors imposing residual errors between predicted and observed sensor outputs (Zhang et al., 17 Mar 2025).

A parallel structure arises in GNSS-inertial navigation, with receiver ECEF position $p_k$, velocity $v_k$, and clock bias $dt_k$ (scaled by the speed of light $c$ so that it is expressed in units of length) at epoch $k$:

$x_k = [p_k^T, v_k^T, c\,dt_k]^T$

and the active state set $X = \{x_{K-N+1}, \ldots, x_K\}$ over the window of $N$ epochs (Bai et al., 2021).
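The windowing described above can be sketched as a bounded queue. This is a minimal illustration rather than any cited system's implementation: the `State` and `SlidingWindow` names, the 7-element state layout, and the window length of 3 are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass

import numpy as np


@dataclass
class State:
    """One window entry (hypothetical layout: position, velocity, c*dt)."""
    epoch: int
    x: np.ndarray  # e.g. [p (3), v (3), c*dt (1)]


class SlidingWindow:
    """Keeps at most max_len states; older ones are handed back to the caller."""

    def __init__(self, max_len: int):
        self.max_len = max_len
        self.states = deque()

    def push(self, state: State) -> list:
        """Append at the leading edge and return the states that left the window."""
        self.states.append(state)
        evicted = []
        while len(self.states) > self.max_len:
            evicted.append(self.states.popleft())  # oldest state first
        return evicted  # the caller would marginalize these


window = SlidingWindow(max_len=3)
evicted_epochs = []
for k in range(5):
    out = window.push(State(epoch=k, x=np.zeros(7)))
    evicted_epochs += [s.epoch for s in out]

active_epochs = [s.epoch for s in window.states]
print(active_epochs, evicted_epochs)
```

Returning the evicted states instead of discarding them keeps the container separate from the marginalization step, which needs those states to build the prior.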

2. Factor Types and Sensor Fusion

SW-FGO is distinguished by the diversity and structure of factor types, each encoding a physical or logical observation model with compatible residual and covariance:

IMU pre-integration factors: Capture continuous-time inertial motion constraints between successive states using pre-integrated IMU measurements with bias correction and covariance propagation, following the Forster et al. formulation. The residual combines rotational, velocity, and position terms, enabling trajectory smoothness and scale estimation (Zhang et al., 17 Mar 2025, Koide et al., 8 Feb 2024).
Range and ToA/TDoA factors: Model geometric distance constraints from time-of-arrival sensors, e.g., UWB anchors or GNSS pseudoranges, typically as nonlinear scalar residuals with robust kernels (Huber/Cauchy) for outlier mitigation. In some frameworks, TDoA factors utilize anchor-pair range differences (Zhang et al., 17 Mar 2025, Bai et al., 2021).
Ultrasonic and elevation factors: For constrained environments, scalar or vector factors encode physical ranges and vertical priors, often stacking elevation and planar range terms to exploit ultrasonic or altimetric readings (Zhang et al., 17 Mar 2025).
Scan-to-scan and scan-to-map registration: For range-inertial SLAM, GICP-type cost functions register rolling point cloud scans against prior maps or each other via distribution-to-distribution error, incorporating robust covariance handling and analytic Jacobians (Koide et al., 8 Feb 2024).
Windowed carrier-phase (WCP) and Doppler constraints: In positioning, time-correlated carrier-phase factors connect multiple GNSS states to exploit inter-epoch consistency and enhance robustness in ambiguous or multipath-rich regimes. Null-space projection removes phase ambiguities (Bai et al., 2021).
Track association variables and exclusion constraints: In tracking (e.g., SWTrack), hypothesis variables $z_h \in \{0,1\}$ represent association decisions for detected object paths through the frame window, with hard exclusion factors ensuring unique assignment per detection and lifted skip-edges addressing missed observations (Papais et al., 27 Feb 2024).
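As a concrete instance of one factor type from the list above, the following sketch evaluates a scalar range factor: the whitened residual between predicted and measured distance to a fixed anchor, together with its analytic Jacobian. The function name, the anchor geometry, and the noise level `sigma` are assumptions chosen for illustration, not values from the cited works.

```python
import numpy as np


def range_residual(p, anchor, measured_range, sigma):
    """Whitened range residual and its 1x3 Jacobian with respect to position p."""
    diff = p - anchor
    dist = np.linalg.norm(diff)
    r = (dist - measured_range) / sigma
    # d(dist)/dp is the unit vector pointing from the anchor to p.
    J = (diff / dist).reshape(1, 3) / sigma
    return r, J


p = np.array([1.0, 2.0, 0.5])
anchor = np.array([4.0, 6.0, 0.5])
r, J = range_residual(p, anchor, measured_range=5.2, sigma=0.1)
print(r, J)
```

Dividing by `sigma` whitens the residual so that its square is the Mahalanobis term that enters the objective. A robust kernel would be applied on top of this whitened value.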

3. Sliding Window Operation and Marginalization

The sliding mechanism enables online operation with bounded computational resources. The process comprises:

Initialization: Window is filled with initial state estimates using physical priors or coarse solutions (e.g., ultrasonic-enhanced localization, gravity-aligned poses in 3D maps) (Zhang et al., 17 Mar 2025, Koide et al., 8 Feb 2024).
State addition and factor insertion: At each new timestep, a fresh state and all associated sensor factors are appended to the window’s leading edge.
Marginalization: When the window exceeds its maximum length, the oldest node is removed. The statistical influence is encoded as a prior factor derived via the Schur complement on the linearized Hessian and gradient:

$r_{\text{prior}}(\Delta X) = H_m \Delta X - g_m$

where $H_m$ and $g_m$ are the marginalized Hessian and gradient (Zhang et al., 17 Mar 2025, Koide et al., 8 Feb 2024).

Factor pruning and densification: Some frameworks perform hypothesis culling (top $M$-best), class/distance pruning, or robustification prior to marginalization for computational efficiency and stability (Papais et al., 27 Feb 2024).

This procedure ensures a constant-size optimization problem with information from marginalized states folded into a dense prior, preserving estimator consistency and reducing drift.
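The Schur-complement step can be sketched on a small dense system. The example assumes the linearized normal equations are written as $H\,\Delta X = b$, with the variables to be marginalized ordered first. The function name `marginalize_oldest` and the random test system are illustrative only.

```python
import numpy as np


def marginalize_oldest(H, b, m):
    """Eliminate the first m variables of H dx = b via the Schur complement.

    Returns the reduced system (H_prior, b_prior) acting on the remaining
    variables only.
    """
    H_mm, H_mr = H[:m, :m], H[:m, m:]
    H_rm, H_rr = H[m:, :m], H[m:, m:]
    # Solve H_mm X = [H_mr | b_m] rather than forming an explicit inverse.
    X = np.linalg.solve(H_mm, np.column_stack([H_mr, b[:m]]))
    H_prior = H_rr - H_rm @ X[:, :-1]
    b_prior = b[m:] - H_rm @ X[:, -1]
    return H_prior, b_prior


# Random symmetric positive-definite system with 6 unknowns (illustrative).
rng = np.random.default_rng(0)
A = rng.standard_normal((12, 6))
H = A.T @ A + 1e-3 * np.eye(6)
b = rng.standard_normal(6)

H_prior, b_prior = marginalize_oldest(H, b, m=2)

dx_full = np.linalg.solve(H, b)                  # solve with all variables
dx_reduced = np.linalg.solve(H_prior, b_prior)   # solve after marginalization
print(np.allclose(dx_full[2:], dx_reduced))
```

At the linearization point, the reduced system gives the same solution for the remaining variables as the full system. The cost is that `H_prior` is generally dense even when `H_rr` was sparse, which is why the text refers to a dense prior.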

4. Optimization Algorithms and Computational Aspects

The SW-FGO backend solves the nonlinear least-squares objective:

$J(X) = \sum_{\text{IMU}} \|r_{\text{IMU}}\|_{\Sigma_I}^2 + \sum_{\text{range}} \rho\left(\|r_{\text{range}}\|_{\Sigma_T}^2\right) + \ldots + \|r_{\text{prior}}\|^2$

using manifold-aware Levenberg–Marquardt algorithms. The core steps are:

Residual linearization: For each factor, compute the Jacobians with respect to connected states, using analytic expressions for high efficiency (Bai et al., 2021, Koide et al., 8 Feb 2024).
Normal equation assembly: Stack all information into the sparse block-structured linear system $(J^T W J + \lambda I)\Delta X = -J^T W r$, where $W$ collects the inverse measurement covariances and $\lambda$ is the Levenberg–Marquardt damping parameter, exploiting the limited window size and factor arity.
Update and retraction: States are updated via manifold retractions (e.g., pose update via exponential map in $SE(3)$) and iterated until convergence per window step.
Marginalization as dense prior: After convergence, the marginalized “prior” is stored and applied to the remaining states going forward.
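The linearize, solve, and update cycle above can be sketched with a toy Levenberg–Marquardt loop. To stay short, the example assumes a Euclidean 2D position estimated from noise-free ranges to four fixed anchors, so the retraction reduces to plain addition. A pose on $SE(3)$ would need an exponential-map update instead. The anchor layout, weights, and damping schedule are all illustrative choices.

```python
import numpy as np

# Hypothetical anchor positions for the toy problem.
ANCHORS = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])


def residuals_and_jacobian(x, ranges):
    """Range residuals to every anchor and their analytic Jacobian."""
    diff = x - ANCHORS
    dist = np.linalg.norm(diff, axis=1)
    return dist - ranges, diff / dist[:, None]


def levenberg_marquardt(x0, ranges, W, lam=1e-3, iters=50, tol=1e-10):
    x = x0.copy()
    r, J = residuals_and_jacobian(x, ranges)
    cost = r @ W @ r
    for _ in range(iters):
        # Damped normal equations: (J^T W J + lam I) dx = -J^T W r
        H = J.T @ W @ J
        g = J.T @ W @ r
        dx = np.linalg.solve(H + lam * np.eye(len(x)), -g)
        x_new = x + dx  # Euclidean retraction (assumption of this sketch)
        r_new, J_new = residuals_and_jacobian(x_new, ranges)
        cost_new = r_new @ W @ r_new
        if cost_new < cost:
            # Accept the step and relax the damping.
            x, r, J, cost = x_new, r_new, J_new, cost_new
            lam *= 0.1
            if np.linalg.norm(dx) < tol:
                break
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return x


x_true = np.array([3.0, 4.0])
ranges = np.linalg.norm(x_true - ANCHORS, axis=1)  # noise-free measurements
W = np.eye(4) / 0.1**2                             # inverse measurement covariance
x_hat = levenberg_marquardt(np.array([5.0, 5.0]), ranges, W)
print(x_hat)
```

The accept/reject rule on the cost is one common way to adapt $\lambda$. Production solvers typically use a gain-ratio test and sparse factorizations instead of a dense solve.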

Window sizes are chosen to balance information retention and computational load (e.g., $W=20$ poses for IMU at 200 Hz, or 5-second duration for lidar-inertial localization). Per-iteration cost is typically $O(mn)$; real-time operation is reported on embedded (Jetson Nano: 21 ms/LM iteration for indoor fusion) and desktop platforms (Zhang et al., 17 Mar 2025, Koide et al., 8 Feb 2024).

5. Adaptive Covariance and Robustness Strategies

SW-FGO excels at integrating time-varying sensor reliability:

Dynamic covariance estimation: Sensor channel metrics (e.g., UWB CIR amplitude, rms-delay-spread) produce adaptive scaling of factor covariances, such as

$\Sigma_T^{(i,j)} = \sigma_0^2 \exp(-\beta Q_{ij}) \quad \text{or} \quad \Sigma_T^{(i,j)} = \sigma_{\min}^2 + (\sigma_{\max}^2 - \sigma_{\min}^2)\,(1 - \tanh(\gamma Q_{ij}))$

downgrading uncertain measurements in the optimizer (Zhang et al., 17 Mar 2025).

Non-line-of-sight (NLOS) mitigation: NLOS-detected UWB links are either given inflated covariance ($\Sigma_T \leftarrow \alpha \Sigma_T$ with $\alpha \gg 1$) or replaced by soft-penalty factors with low weight $w_{NLOS}$, suppressing their influence without discarding information (Zhang et al., 17 Mar 2025).
Robust kernels: All cost terms may be protected by robust loss functions (e.g., Huber, Cauchy) to resist outliers from multipath, degeneracy, or spurious correspondences (Zhang et al., 17 Mar 2025, Koide et al., 8 Feb 2024).
Hypothesis pruning and lifted edges: In multi-object tracking, aggressive pruning and skip-edge insertion handle occlusion, misdetection, and reduce the data association search space (Papais et al., 27 Feb 2024).

These strategies maintain estimator stability under signal degradation, multipath interference, and partial sensor outages, enabling sub-decimeter accuracy even at 40% packet reception rates in real deployments (Zhang et al., 17 Mar 2025).
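The covariance scaling, NLOS inflation, and robust weighting described in this section can be sketched as small scalar functions. The sketch assumes that $Q_{ij} \geq 0$ is a link-quality score for which larger values mean a more reliable measurement, as the form of the two variance models suggests. All numeric parameters (`sigma0`, `beta`, `gamma`, `alpha`, the Huber threshold) are placeholder values, not those of the cited system.

```python
import numpy as np


def variance_exponential(q, sigma0=0.10, beta=2.0):
    """sigma0^2 * exp(-beta * Q): variance shrinks as quality Q grows."""
    return sigma0**2 * np.exp(-beta * q)


def variance_tanh(q, sigma_min=0.05, sigma_max=0.50, gamma=1.5):
    """Bounded variant: sigma_max^2 at Q = 0, approaching sigma_min^2 for large Q."""
    return sigma_min**2 + (sigma_max**2 - sigma_min**2) * (1.0 - np.tanh(gamma * q))


def inflate_if_nlos(variance, is_nlos, alpha=25.0):
    """Scale the variance by alpha >> 1 when the link is flagged as NLOS."""
    return alpha * variance if is_nlos else variance


def huber_weight(r_whitened, delta=1.345):
    """Reweighting factor of the Huber kernel for a whitened residual."""
    a = abs(r_whitened)
    return 1.0 if a <= delta else delta / a


for q in (0.0, 0.5, 2.0):
    print(q, variance_exponential(q), variance_tanh(q))

print(inflate_if_nlos(variance_tanh(0.5), is_nlos=True))
print(huber_weight(0.5), huber_weight(2.69))
```

The bounded form keeps the variance between `sigma_min**2` and `sigma_max**2`, so a very good or very poor link cannot dominate or vanish from the optimization entirely. The exponential form has no lower bound as $Q$ grows.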

6. Performance Characteristics and Applications

SW-FGO has demonstrated substantial empirical improvements across challenging applications:

Indoor fusion positioning (IMU-UWB-ultrasonic): Achieves 38% lower RMSE (12.3 cm) compared to EKF baselines, with robust performance in cluttered NLOS environments, and vertical drift suppression by a factor $\geq 6$ (Zhang et al., 17 Mar 2025).
GNSS positioning in urban canyons: Lane-level accuracy ($\sim$1.8–3.0 m) is obtained by leveraging windowed carrier-phase constraints, outperforming EKF-based TDCP fusion, and showing resilience on low-cost platforms (Bai et al., 2021).
3D multi-object tracking: SWTrack delivers improved AMOTA and reduced FP/FN rates by jointly optimizing track assignment hypotheses in a window, outperforming frame-by-frame greedy trackers (Papais et al., 27 Feb 2024).
Range-inertial localization on 3D prior maps: Tightly-coupled scan-to-scan and scan-to-map windowed optimization exhibits robust operation in severe point cloud degeneration and featureless regions, significantly exceeding filter-based approaches (Koide et al., 8 Feb 2024).

In all cases, the tightly-coupled, sliding-window design enables real-time estimation with high resilience to corruption, occlusion, or measurement outage.

7. Representative Frameworks and Implementation Notes

Several open or published frameworks implement SW-FGO in diverse modalities:

System/Application	State Vector	Main Factor Types	Marginalization Approach
IMU-UWB-Ultrasonic Fusion (Zhang et al., 17 Mar 2025)	Pose, velocity, biases	IMU, UWB TDoA, ultrasonic, NLOS	Dense prior (Hessian, gradient)
GNSS Windowed Carrier-Phase (Bai et al., 2021)	ECEF pos., vel., clock	Pseudorange, Doppler, WCP	Drop-oldest state, efficient update
3D Multi-Object Tracking (SWTrack) (Papais et al., 27 Feb 2024)	Track hypotheses	Cost unary, skip/lifted, exclusion	Hypothesis culling, LP relaxation
Range-Inertial on 3D Map (Koide et al., 8 Feb 2024)	Pose, velocity, biases	IMU, scan-to-scan, scan-to-map	Bayes-tree (iSAM2)

GTSAM-style nonlinear optimization on SE(3), robust prior marginalization, and GPU acceleration for point cloud registration are frequently used; window sizes are tailored by scenario and platform to balance information retention against resource use.

SW-FGO is now a mature methodology for real-time, robust state estimation in scenarios that defy classical filter-based and batch optimization approaches. Its applications range from sensor fusion and positioning to multi-hypothesis tracking and SLAM, all of which rely on the structure and flexibility of factor graphs for bounded-memory, adaptively robust inference.