GraphSLAM: Graph-Based SLAM

Updated 10 March 2026

GraphSLAM is a graph-based SLAM technique that formulates localization and mapping as a global non-linear least squares optimization problem.
It constructs a sparse factor graph where nodes represent robot poses and landmarks while edges encode odometry and semantic constraints.
The approach enables applications in multi-agent mapping and autonomous systems by integrating heterogeneous sensor data and ensuring global consistency.

GraphSLAM (Graph-based Simultaneous Localization and Mapping) is a class of algorithms for robotic SLAM that formulates the estimation problem as a global, non-linear, least-squares optimization over a sparse factor graph. Each node in the graph encodes robot states (typically SE(2) or SE(3) poses) and/or map variables (landmarks, object parameters, semantic regions), while edges represent spatial constraints derived from odometry, exteroceptive sensor measurements, or semantic associations. The underlying optimization seeks the set of state variables that best satisfy all constraints, typically in a probabilistic (maximum-likelihood or maximum a posteriori) sense.

1. Factor-Graph Formulation and State Representation

At the core of GraphSLAM is the construction of a factor (or pose) graph, $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where:

$\mathcal{V}$ contains nodes for variables such as robot keyframe poses $x_i \in \mathrm{SE}(3)$, low-level features $L_j \in \mathbb{R}^3$, semantic landmarks (objects, planes, rooms), and, in some approaches, velocities or biases.
$\mathcal{E}$ encodes constraints (factors), each representing a probabilistic relationship between variables: odometry, loop closures, landmark observations, or higher-level semantic associations.

The general probabilistic structure is

$P(\mathbf{x}, L \mid \mathcal{Z}, U) \propto \prod_{\text{odometry}} p(x_{i+1} \mid x_i, u_i) \prod_{\text{measurements}} p(z_{ij} \mid x_i, L_j)$

where $U$ collects control inputs and $\mathcal{Z}$ the observations. Each factor can typically be written as a Gaussian potential,

$\psi(x_a, x_b) \propto \exp\left(-\frac{1}{2} \|e(x_a, x_b; z)\|^2_{\Sigma^{-1}}\right)$

with $e(\cdot)$ the residual and $\Sigma$ the measurement covariance.
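As a minimal sketch of the Gaussian potential above, the following evaluates the squared Mahalanobis norm $\|e\|^2_{\Sigma^{-1}} = e^\top \Sigma^{-1} e$ and the corresponding unnormalized factor value. The function names, the 2D residual, and the covariance values are illustrative assumptions, not taken from any of the cited systems.

```python
import numpy as np

def mahalanobis_sq(e, sigma):
    """Squared Mahalanobis norm ||e||^2_{Sigma^{-1}} = e^T Sigma^{-1} e."""
    e = np.asarray(e, dtype=float)
    # Solve Sigma * y = e instead of forming the explicit inverse.
    return float(e @ np.linalg.solve(sigma, e))

def factor_potential(e, sigma):
    """Unnormalized Gaussian potential exp(-0.5 * ||e||^2_{Sigma^{-1}})."""
    return float(np.exp(-0.5 * mahalanobis_sq(e, sigma)))

# Hypothetical 2D residual (metres) and measurement covariance (assumed values).
residual = np.array([0.3, -0.4])
sigma = np.diag([0.1 ** 2, 0.2 ** 2])

d2 = mahalanobis_sq(residual, sigma)       # 0.09/0.01 + 0.16/0.04
cost = 0.5 * d2                            # negative log of the potential
potential = factor_potential(residual, sigma)
```

Taking the negative logarithm of the product of all such potentials turns the posterior maximization into a sum of weighted squared residuals, which is the least-squares form used in the rest of the article.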

Extended state spaces are widespread: e.g., PROB-SLAM adds semantic-probabilistic edge weights to each feature observation (Meng et al., 2022); Multi S-Graphs overlays robot poses, planes, rooms, and floors into a four-level hierarchy (Fernandez-Cortizas et al., 2023); LG-SLAM uses an augmented state including velocities and actuator biases (Montano-Oliván et al., 2024); and object-SLAM augments the pose graph with nonparametric (Dirichlet-process-based) data-association variables (Mu et al., 2017).

2. Measurement Models and Types of Factors

The construction of constraint edges is application-dependent but follows a common taxonomy:

Odometry/Tracking Factors: Pairwise SE(2) or SE(3) constraints from wheel encoders, IMU, visual odometry, or LiDAR scan registration, $r_{\text{odom}}(x_{i-1}, x_i) = \log(\hat T_{i-1, i}^{-1} (x_{i-1}^{-1} x_i))$, where $\hat T_{i-1, i}$ is the measured relative transform (Thomas et al., 2025, Wei et al., 2021, Montano-Oliván et al., 2024).
Landmark/Feature Observation Factors: Connection between a pose and a map element, e.g., reprojection error for feature $i$ in pose $k$,

$e^r_{k,i}(X_k, L_i) = u^\text{obs}_{k,i} - \pi(X_k \oplus L_i)$

where $\pi(\cdot)$ projects world points to the image (Meng et al., 2022), or range-bearing for cones in automotive SLAM (Alvarez et al., 2022).

Semantic Constraints: Constraints between poses and semantic objects/planes/rooms, where associations can be hierarchical (plane to room, room to floor (Fernandez-Cortizas et al., 2023)) or soft (probability-weighted based on detection confidence (Meng et al., 2022)).
Loop Closure Factors: Non-consecutive pose-pose or submap-submap constraints from scan/context registration, object/scene recognition, or semantic descriptor matching; critical for global consistency (Montano-Oliván et al., 2024, Thomas et al., 2025).
Additional Constraints: Domain-specific factors such as physical widths (multi-lane perception (Abramov et al., 2017)), switchable constraints for dynamic object rejection, or robustification via outlier detection/voting (Montano-Oliván et al., 2024).

Measurement models are typically accompanied by explicit covariance/information matrices reflecting sensor uncertainty, and are often adaptively weighted by confidence, e.g., using semantic probability maps (Meng et al., 2022).
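The odometry residual listed above can be sketched for the planar SE(2) case as follows. This is an illustration under stated assumptions: poses are `(x, y, theta)` tuples, the helper names are hypothetical, and the error transform is read off directly as an `(x, y, theta)` vector rather than through the exact Lie-algebra logarithm (the two agree to first order for small errors). The information matrix is assumed diagonal for brevity.

```python
import math

def wrap_angle(theta):
    """Wrap an angle to [-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))

def se2_between(a, b):
    """Relative transform a^{-1} * b for SE(2) poses given as (x, y, theta)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(a[2]), math.sin(a[2])
    # Rotate the world-frame offset into the frame of pose a.
    return (c * dx + s * dy, -s * dx + c * dy, wrap_angle(b[2] - a[2]))

def odometry_residual(x_prev, x_curr, t_hat):
    """Error transform T_hat^{-1} * (x_prev^{-1} * x_curr) as (x, y, theta)."""
    return se2_between(t_hat, se2_between(x_prev, x_curr))

def weighted_cost(r, info_diag):
    """r^T * Omega * r for a diagonal information matrix Omega."""
    return sum(w * e * e for w, e in zip(info_diag, r))

# Hypothetical values: the robot moved 1.0 m forward, odometry reported 0.9 m.
x_prev = (1.0, 2.0, math.pi / 2)
x_curr = (1.0, 3.0, math.pi / 2)
t_hat = (0.9, 0.0, 0.0)

r = odometry_residual(x_prev, x_curr, t_hat)
cost = weighted_cost(r, info_diag=(100.0, 100.0, 400.0))
```

A larger information entry makes the optimizer trust that residual component more, which is how confidence-based weighting enters the cost.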

3. Optimization Objectives and Solution Algorithms

The aggregate objective is a nonlinear least-squares cost over all residuals:

$J(\mathbf{x}) = \sum_{\text{edges}} \|r_\text{edge}(\cdot)\|^2_\Omega$

where $\Omega = \Sigma^{-1}$ is the information matrix of each edge. Typical solvers are the Gauss–Newton and Levenberg–Marquardt algorithms. The system is linearized at each iteration, leading to sparse normal equations $H \delta x = -b$, where $H = \sum A^\top \Omega A$ is the approximate Hessian (the system information matrix) assembled from the residual Jacobians $A = \partial r / \partial \mathbf{x}$, and $b = \sum A^\top \Omega r$ is the gradient vector.
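A minimal Gauss-Newton sketch on a toy one-dimensional pose graph illustrates how the normal equations $H \delta x = -b$ are assembled and solved. Everything here is an assumption made for illustration: scalar poses, the edge tuple layout `(i, j, z, omega)`, and fixing pose 0 to remove the gauge freedom. Real back ends work on SE(2)/SE(3) and exploit sparsity rather than dense matrices.

```python
import numpy as np

def gauss_newton_1d(x_init, edges, iters=5):
    """Optimize scalar poses; each edge (i, j, z, omega) measures x_j - x_i = z."""
    x = np.array(x_init, dtype=float)
    n = len(x)
    for _ in range(iters):
        H = np.zeros((n, n))
        b = np.zeros(n)
        for i, j, z, omega in edges:
            r = (x[j] - x[i]) - z            # residual of this edge
            jac = np.zeros(n)
            jac[i], jac[j] = -1.0, 1.0       # d r / d x
            H += omega * np.outer(jac, jac)  # accumulate J^T * Omega * J
            b += omega * jac * r             # accumulate J^T * Omega * r
        # Gauge fix: keep pose 0 constant and solve only for the other poses.
        dx = np.zeros(n)
        dx[1:] = np.linalg.solve(H[1:, 1:], -b[1:])
        x += dx
        if np.linalg.norm(dx) < 1e-12:
            break
    return x

# Two odometry edges of 1.0 m and one loop-closure edge claiming 1.8 m in total.
edges = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.8, 1.0)]
x_opt = gauss_newton_1d([0.0, 1.0, 2.0], edges)
```

With equal information on all three edges, the 0.2 m disagreement is spread evenly, giving poses of roughly 0.933 and 1.867. Because this toy problem is linear, a single iteration already reaches the optimum.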

Major solvers in use are:

g2o: Widely adopted, supports customizable vertex and edge types, and is highly efficient for sparse graphs (Alvarez et al., 2022, Wei et al., 2021, Meng et al., 2022).
GTSAM: Implements variable elimination and variable reordering, used in large-scale 3D systems such as GRAND-SLAM (Thomas et al., 2025).
Custom/Parallelized CUDA backends: Exploited for real-time operation in GPU-rich systems (Montano-Oliván et al., 2024).

Some systems employ fixed-lag or windowed smoothing to maintain real-time performance, marginalizing out old variables into priors for tractable optimization (Montano-Oliván et al., 2024, Abramov et al., 2017).
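Marginalizing old variables into a prior is commonly done with a Schur complement on the linearized system. The sketch below assumes a dense information matrix `H` and vector `b` in the $H \delta x = -b$ convention; the function name and the toy three-variable chain are hypothetical, and the cited systems may handle relinearization and sparsity differently.

```python
import numpy as np

def marginalize(H, b, marg_idx):
    """Schur-complement marginalization of the variables listed in marg_idx.

    Returns the reduced (H, b) over the kept variables and their indices.
    """
    marg = np.array(sorted(marg_idx))
    keep = np.array([k for k in range(H.shape[0]) if k not in set(marg_idx)])
    H_mm = H[np.ix_(marg, marg)]
    H_km = H[np.ix_(keep, marg)]
    H_kk = H[np.ix_(keep, keep)]
    # H' = H_kk - H_km * H_mm^{-1} * H_mk,  b' = b_k - H_km * H_mm^{-1} * b_m
    H_red = H_kk - H_km @ np.linalg.solve(H_mm, H_km.T)
    b_red = b[keep] - H_km @ np.linalg.solve(H_mm, b[marg])
    return H_red, b_red, keep

# Toy chain x0 - x1 - x2 with a unit prior on x0 (assumed values).
H = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
b = np.array([0.3, -0.1, 0.2])

H_red, b_red, keep = marginalize(H, b, marg_idx=[0])
dx_full = np.linalg.solve(H, -b)          # update from the full system
dx_red = np.linalg.solve(H_red, -b_red)   # update from the reduced system
```

The reduced system yields the same update for the kept variables as the full one, which is why the old pose can be dropped from the window without discarding its information.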

4. Data Association and Semantics

Data association, the linking of observations to candidate landmarks, is crucial, and approaches vary:

Nearest-Neighbor or KNN Gating: Using Mahalanobis distance in measurement space (Alvarez et al., 2022, Abramov et al., 2017).
Soft Probabilistic Assignment: Semantic probability maps yield per-keypoint weights (e.g., from YOLOv5 Gaussian confidence) to modulate edge influence without hard thresholding (Meng et al., 2022).
Nonparametric/Dirichlet-Process Inference: Landmark instantiation is not fixed; Dirichlet process assigns measurement associations and grows the map as justified by evidence, allowing for joint data-association and trajectory inference (Mu et al., 2017).
Collaborative Semantic Matching: In distributed settings, agents exchange higher-level semantic summaries (room, plane descriptors), not raw measurements, for loop closure and data fusion (Fernandez-Cortizas et al., 2023).
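The nearest-neighbor gating strategy from the list above can be sketched as follows. This is an illustrative assumption of how such a gate might look, not the procedure of any cited system: measurements are 2D, each landmark comes with a predicted measurement and an innovation covariance, and the gate is the 95% chi-square quantile for two degrees of freedom (about 5.991).

```python
import numpy as np

CHI2_GATE_2DOF_95 = 5.991  # 95% chi-square quantile, 2 degrees of freedom

def associate(z, predicted, innovation_covs, gate=CHI2_GATE_2DOF_95):
    """Return the index of the closest landmark inside the gate, or None."""
    best_j, best_d2 = None, gate
    for j, (z_hat, S) in enumerate(zip(predicted, innovation_covs)):
        nu = np.asarray(z, dtype=float) - np.asarray(z_hat, dtype=float)
        d2 = float(nu @ np.linalg.solve(S, nu))  # squared Mahalanobis distance
        if d2 < best_d2:
            best_j, best_d2 = j, d2
    return best_j

# Two hypothetical landmarks with identical innovation covariance.
predicted = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [0.25 * np.eye(2), 0.25 * np.eye(2)]

match_near = associate([0.4, 0.3], predicted, covs)   # inside the gate of landmark 0
match_none = associate([2.5, 2.5], predicted, covs)   # outside every gate
```

A `None` result would typically trigger the creation of a new landmark node; soft or nonparametric schemes replace this hard decision with weights or sampled assignments.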

Semantic information is often encoded directly into the factor graph, e.g., via additional nodes (rooms, floors), semantic-plane constraints, or unary semantic penalties. This semantic augmentation reduces false loop closures and increases map interpretability (Fernandez-Cortizas et al., 2023, Meng et al., 2022).

5. Submap Management and Loop Closure

Submaps, local aggregations of scans or features anchored to keyframes, are a recurring strategy:

Anchoring: Each submap is rigidly linked to a keyframe pose; after optimization, all points are transformed accordingly (Montano-Oliván et al., 2024, Thomas et al., 2025).
Pruning: Only recent scans are retained per submap to bound memory and computational cost, with old submaps dynamically loaded or discarded as necessary (Montano-Oliván et al., 2024).
Loop Closure: Candidate loop closures are identified via pose/topological heuristics, submap-submap registration (ICP, NDT), or descriptor retrieval (e.g., NetVLAD for multi-agent SLAM (Thomas et al., 2025)). Outlier rejection mechanisms (voting or robust loss kernels) uphold global consistency.
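The anchoring idea can be sketched in 2D: submap points are stored once in the frame of their anchor keyframe and are re-projected into the world frame whenever the optimizer updates that keyframe pose. The helper names, the planar simplification, and the example poses are assumptions made for illustration.

```python
import numpy as np

def se2_matrix(x, y, theta):
    """Homogeneous 3x3 transform for a planar pose (x, y, theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x],
                     [s, c, y],
                     [0.0, 0.0, 1.0]])

def submap_to_world(points_local, anchor_pose):
    """Map Nx2 points from the anchor keyframe frame into the world frame."""
    pts = np.asarray(points_local, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (se2_matrix(*anchor_pose) @ homog.T).T[:, :2]

# Points are stored in the anchor frame and never modified.
points_local = [[1.0, 0.0], [0.0, 1.0]]

before = submap_to_world(points_local, (0.0, 0.0, 0.0))        # initial estimate
after = submap_to_world(points_local, (2.0, 1.0, np.pi / 2))   # after optimization
```

Because only the anchor pose changes, a loop closure that shifts many keyframes costs one transform per submap rather than a re-registration of every scan.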

A summary of these mechanisms is:

System	Submap Management	Loop Closure Cue
LG-SLAM	Sliding window, anchoring, dynamic load	ICP/NDT + Mahalanobis + voting (Montano-Oliván et al., 2024)
GRAND-SLAM	Local submaps, global transform	NetVLAD, frame-to-model, ICP, filter (Thomas et al., 2025)
Ground-SLAM	Sensor-centric, covariance-aware	Scan-to-map ICP, ground-plane constraints (Wei et al., 2021)

6. Application Domains and Empirical Performance

GraphSLAM frameworks are now applied across diverse robotic contexts:

Large-Scale and Multi-Agent SLAM: GRAND-SLAM demonstrates multi-robot, outdoor, and indoor mapping with Gaussian splatting, achieving multi-agent RMSE as low as 0.25 cm on Replica and 4.99 m on Kimera-Multi, outperforming single-agent and previous multi-agent Gaussian SLAM (Thomas et al., 2025).
Semantic SLAM: Incorporation of semantic edges/weights, as in PROB-SLAM and Multi S-Graphs, yields 15% ATE improvements and reduces loop closure outliers by more than a factor of three versus geometric-only baselines (Meng et al., 2022, Fernandez-Cortizas et al., 2023).
Automotive/AD Systems: GraphSLAM-based multi-lane perception enables high-quality ego and adjacent lane tracking up to 120 m (Abramov et al., 2017), while high-speed race scenarios achieve sub-15 cm RMS cone-mapping at over 70 km/h (Alvarez et al., 2022).
Range-Inertial/Mining: LG-SLAM attains < 20 cm ATE across challenging environments, including underground mines and offices, by integrating IMU, LiDAR, and GNSS into a unified factor graph (Montano-Oliván et al., 2024).
Object-SLAM: Nonparametric pose-graph models recover the correct number and pose of objects under severe data association ambiguity (Mu et al., 2017).

Empirical advantages stem from the ability of GraphSLAM to incorporate heterogeneous data, enforce global consistency via loop closures, and exploit semantic cues to disambiguate structural symmetries or dynamics.
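Several of the figures above are reported as absolute trajectory error (ATE). As a rough sketch of how an ATE RMSE is computed, the following assumes the estimated and ground-truth trajectories are already time-synchronized and expressed in the same frame; standard evaluations usually apply a rigid alignment step first, which is omitted here. The trajectories are made-up numbers, not data from any cited paper.

```python
import numpy as np

def ate_rmse(est_xy, gt_xy):
    """RMSE of per-pose position errors between aligned trajectories."""
    est = np.asarray(est_xy, dtype=float)
    gt = np.asarray(gt_xy, dtype=float)
    err = np.linalg.norm(est - gt, axis=1)   # Euclidean error per pose
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical planar trajectories in metres.
gt = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
est = [[0.03, 0.04], [1.0, 0.0], [2.0, -0.05]]

rmse = ate_rmse(est, gt)
```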

7. Limitations and Challenges

While GraphSLAM methods are effective across domains, several persistent challenges have been identified:

Reliance on Data Association: Failure of data association (e.g., in ambiguous or perceptually aliased environments) can result in catastrophic errors. Nonparametric and semantic approaches mitigate this but add computational and modeling complexity (Mu et al., 2017, Fernandez-Cortizas et al., 2023).
Scalability: Global optimization over large graphs incurs significant computational cost, prompting work on windowing, prioritization, and distributed protocols (Montano-Oliván et al., 2024, Fernandez-Cortizas et al., 2023, Thomas et al., 2025).
Dynamic Scenes: Moving objects and scene changes degrade performance unless explicitly modeled or downweighted through probability or semantic masking (Meng et al., 2022).
Dependence on Reliable Semantics: Semantic constraint reliability is fundamental—failure in segmentation or object extraction propagates through the graph, diminishing robustness (Fernandez-Cortizas et al., 2023).
Communication/Integration in Multi-Agent Systems: Efficient factor exchange (preferably semantic summaries vs. raw sensor data) is critical for bandwidth and scalability; full global consistency across all agents remains a challenge (Fernandez-Cortizas et al., 2023, Thomas et al., 2025).

Future research targets richer multi-modal integration, improved global optimization across distributed teams, tighter semantic coupling, and further reduction of reliance on hand-tuned front-end parameters and associations.