Global Motion Averaging

Updated 26 May 2026

Global motion averaging is a technique that recovers absolute rigid transformations from noisy relative motion observations, ensuring coherent global configurations.
It employs methods like convex relaxations, block-coordinate descent, and robust loss functions to handle nonconvexity and outlier effects in the data.
This framework is pivotal in applications such as SfM, SLAM, multi-view registration, and map merging, where scalability and precision are essential.

Global motion averaging is a fundamental optimization paradigm within computer vision, robotics, surveying, and geospatial data processing that seeks to simultaneously recover a collection of absolute motions (rigid body transformations) from a sparse and typically noisy set of pairwise relative motions. At its core, global motion averaging provides a principled framework for synthesizing locally estimated pairwise pose constraints into consistent global configurations, enabling applications such as structure-from-motion (SfM), simultaneous localization and mapping (SLAM), multi-view registration of point clouds, and large-scale map merging. The principal challenges of global motion averaging arise from the nonconvexity of the rotation group, the presence of measurement outliers, the heterogeneity of input data quality, and the need for scalability to very large problem sizes.

1. Mathematical Formulation and Problem Scope

Global motion averaging is defined over a measurement graph $G = (V, E)$ where each node $i \in V$ corresponds to an unknown absolute motion $M_i$ (commonly an element of SO(3), SE(3), or SE(2)), and each edge $(i, j) \in E$ represents a noisy observation of the relative motion $M_{ij} \approx M_i^{-1} M_j$. The canonical objective is to estimate the global motions $\{M_i\}$ that best fit the observed $\{M_{ij}\}$ under a loss, typically $\min_{\{M_i\}} \sum_{(i,j)\in E} w_{ij} \, \rho\bigl(\text{dist}(M_{ij}, M_i^{-1} M_j)\bigr)$, where $w_{ij} \geq 0$ are confidence weights, $\rho(\cdot)$ is a penalty function (the squared loss, or a robust alternative such as Huber or a MAGSAC-style loss), and $\text{dist}(\cdot, \cdot)$ is an appropriate group-invariant metric (Frobenius, geodesic, chordal, or angle-axis disparity) (Dellaert et al., 2020, Li et al., 2020, Pan et al., 2024, Tao et al., 4 Jul 2025).
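As an illustration of the objective above, the following minimal sketch evaluates the cost for the SO(3) case with the chordal (Frobenius) distance. The function names, the dictionary-based graph representation, and the toy rotations are assumptions made for this example, not part of any cited system.

```python
import numpy as np

def chordal_dist(A, B):
    """Frobenius (chordal) distance between two rotation matrices."""
    return np.linalg.norm(A - B, ord="fro")

def averaging_cost(R, rel, weights=None, rho=lambda d: d ** 2):
    """Evaluate sum over edges of w_ij * rho(dist(R_ij, R_i^{-1} R_j)).

    R   : dict node -> 3x3 absolute rotation (hypothetical representation)
    rel : dict (i, j) -> 3x3 measured relative rotation R_ij
    """
    total = 0.0
    for (i, j), R_ij in rel.items():
        w = 1.0 if weights is None else weights[(i, j)]
        # For rotations, the inverse is the transpose.
        total += w * rho(chordal_dist(R_ij, R[i].T @ R[j]))
    return total

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Three nodes with noise-free relative rotations: the cost vanishes.
R_true = {0: rot_z(0.0), 1: rot_z(0.3), 2: rot_z(0.9)}
rel = {(i, j): R_true[i].T @ R_true[j] for (i, j) in [(0, 1), (1, 2), (0, 2)]}
cost_exact = averaging_cost(R_true, rel)

# Perturbing one absolute rotation makes the two incident edges inconsistent.
R_bad = dict(R_true)
R_bad[2] = rot_z(1.0)
cost_perturbed = averaging_cost(R_bad, rel)
```

Passing a different `rho` (for example a Huber function) turns the same routine into a robust cost without changing the graph traversal.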

Common specializations include:

Rotation averaging (SO(3)): Recovering sensor or camera orientations.
SE(3) motion averaging: Recovering full pose including translation.
Hybrid or hierarchical multi-camera settings: Incorporating rigid multi-camera rigs (Tao et al., 4 Jul 2025).

The problem is nearly always nonconvex due to the group structure, and it is subject to ambiguities in scale and in gauge (the choice of global reference frame).
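A short worked identity makes the reference-frame ambiguity concrete. For any fixed motion $G$ in the same group (a symbol introduced here only for illustration), replacing every $M_i$ by $G M_i$ leaves each predicted relative motion unchanged:

$$(G M_i)^{-1} (G M_j) = M_i^{-1} G^{-1} G M_j = M_i^{-1} M_j.$$

The objective therefore cannot distinguish $\{M_i\}$ from $\{G M_i\}$. A common remedy is to fix one node to the identity, or to align the estimate to a reference configuration before comparing results.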

2. Algorithmic Approaches and Optimization Hierarchies

A wide spectrum of techniques has been developed to address global motion averaging efficiently and robustly:

2.1 Convex Relaxations and Certification

Semidefinite programming (SDP) relaxations can convexify the rotation averaging problem by lifting to a positive semidefinite matrix variable $Z$ and relaxing the rank constraint (Dellaert et al., 2020). Shonan rotation averaging uses a low-rank Burer–Monteiro factorization and poses the problem on the higher-dimensional rotation manifolds $SO(p)$ ($p \geq 3$), combining manifold optimization with a Riemannian staircase scheme. Global optimality can be certified a posteriori by verifying eigenvalue conditions on the certificate matrix; as $p$ increases, the lifted problem becomes easier, and the original $SO(3)$ solution is recovered once the certificate check passes (Dellaert et al., 2020).

2.2 Block Coordinate and Parallel Methods

Alternating or block coordinate descent (BCD) methods decouple the global update into independent per-node subproblems that admit closed-form SVD-based solutions (projecting the result back onto SO(3)). The updates can be run serially (standard BCD) or in parallel via surrogate majorization (SUM/MM); both variants come with convergence guarantees to stationary points and frequently reach the global optimum in practice (Dong et al., 2021). Such methods scale to large $|V|$ and are widely used as the computational workhorses of modern systems.
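The per-node update can be sketched as follows for the unweighted chordal cost $\sum_{(i,j)} \|R_{ij} - R_i^\top R_j\|_F^2$. This is a minimal serial BCD illustration under stated assumptions (node 0 held fixed to remove the gauge freedom, identity initialization, noise-free toy data); it is not the specific algorithm of Dong et al., and all function names are hypothetical.

```python
import numpy as np

def project_to_so3(A):
    """Closest rotation to A in Frobenius norm, via SVD with a determinant fix."""
    U, _, Vt = np.linalg.svd(A)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def bcd_rotation_averaging(rel, n, R_init, sweeps=50):
    """Serial block coordinate descent; node 0 stays fixed as the gauge."""
    R = [r.copy() for r in R_init]
    for _ in range(sweeps):
        for k in range(1, n):
            A = np.zeros((3, 3))
            for (i, j), R_ij in rel.items():
                if i == k:        # R_ij ~ R_k^T R_j  =>  R_k ~ R_j R_ij^T
                    A += R[j] @ R_ij.T
                elif j == k:      # R_ij ~ R_i^T R_k  =>  R_k ~ R_i R_ij
                    A += R[i] @ R_ij
            # Maximizing trace(R_k^T A) over SO(3) is a 3x3 SVD projection.
            R[k] = project_to_so3(A)
    return R

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Toy problem: four nodes, five noise-free edges.
R_true = [np.eye(3), rot_z(0.3), rot_x(0.4) @ rot_z(0.2), rot_x(-0.5)]
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
rel = {(i, j): R_true[i].T @ R_true[j] for (i, j) in edges}
R_est = bcd_rotation_averaging(rel, 4, [np.eye(3)] * 4)
```

Because each update touches only one node and its incident edges, nodes that share no edge could be updated concurrently, which is the starting point for the parallel variants.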

2.3 Robustification and Outlier Handling

Robust losses such as Huber, L1, Tukey, or MAGSAC are essential in real-world settings. Modern pipelines replace quadratic costs with robust penalties and employ iteratively reweighted least squares (IRLS), either on the manifold (geodesic or chordal residuals) or after covariance propagation from two-view geometry (Zhang et al., 2023, Pan et al., 2024). Outlier edges are pruned via loop-consistency checks, maximum spanning tree sampling, or spanning subgraph sampling (Jiang et al., 2017, Chen et al., 2021). Recent work demonstrates that estimating measurement uncertainties from two-view geometry and using them as edge weights in the averaging improves accuracy and robustness (Zhang et al., 2023).
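The reweighting step of IRLS can be sketched for the Huber loss on geodesic residuals. The snippet below is a minimal illustration, assuming a hypothetical threshold `delta` of 0.1 rad and toy data with one corrupted edge; it shows only the weight computation, not a full solver.

```python
import numpy as np

def geodesic_angle(A, B):
    """Rotation angle (radians) of A^T B, the geodesic distance on SO(3)."""
    c = (np.trace(A.T @ B) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def huber_irls_weights(R, rel, delta=0.1):
    """Huber IRLS weights: 1 for residuals within delta, delta / r beyond it."""
    w = {}
    for (i, j), R_ij in rel.items():
        r = geodesic_angle(R_ij, R[i].T @ R[j])
        w[(i, j)] = 1.0 if r <= delta else delta / r
    return w

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Current estimate (here the ground truth) and measurements with one outlier.
R = {0: rot_z(0.0), 1: rot_z(0.3), 2: rot_z(0.9)}
rel = {(0, 1): R[0].T @ R[1],
       (1, 2): R[1].T @ R[2] @ rot_z(1.0),   # corrupted by 1 rad
       (0, 2): R[0].T @ R[2]}
w = huber_irls_weights(R, rel)
```

In a complete IRLS loop these weights would be recomputed after every weighted averaging solve, so that edges with persistently large residuals lose influence over the iterations.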

2.4 Alternating Scheme: Rotation and Translation Averaging

In SE(3), the problem classically decouples into:

Rotation averaging (as above)
Translation averaging: Given solved rotations, estimate translations via sparse linear least-squares or L1-constrained formulations, with or without additional per-cluster scale ambiguity (multi-cluster SfM) (Zhu et al., 2017, Tao et al., 4 Jul 2025).
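The translation step can be sketched as a linear least-squares problem. With $M = (R, t)$, the relation $M_{ij} \approx M_i^{-1} M_j$ gives $t_{ij} \approx R_i^\top (t_j - t_i)$, so $t_j - t_i \approx R_i t_{ij}$ is linear in the unknowns once the rotations are fixed. The sketch below makes two simplifying assumptions that are labeled here because real pipelines often differ: relative translations carry metric scale (two-view geometry alone yields only directions), and a dense solver stands in for the sparse one.

```python
import numpy as np

def translation_averaging(R, rel_t, n):
    """Solve t_j - t_i = R_i t_ij in least squares, with t_0 = 0 as the gauge."""
    rows, rhs = [], []
    for (i, j), t_ij in rel_t.items():
        A = np.zeros((3, 3 * n))
        A[:, 3 * j:3 * j + 3] = np.eye(3)
        A[:, 3 * i:3 * i + 3] = -np.eye(3)
        rows.append(A)
        rhs.append(R[i] @ t_ij)
    A = np.vstack(rows)[:, 3:]            # drop the columns of t_0 (fixed to zero)
    b = np.concatenate(rhs)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.vstack([np.zeros(3), sol.reshape(n - 1, 3)])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy problem: known rotations, noise-free relative translations.
R = [rot_z(0.0), rot_z(0.4), rot_z(-0.7)]
t_true = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.5], [-1.5, 0.3, 2.0]])
edges = [(0, 1), (1, 2), (0, 2)]
rel_t = {(i, j): R[i].T @ (t_true[j] - t_true[i]) for (i, j) in edges}
t_est = translation_averaging(R, rel_t, 3)
```

Each edge contributes three rows with only two nonzero 3x3 blocks, which is why the system is sparse for large graphs.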

Recent approaches increasingly opt for joint or hybrid parameterizations, absorbing scale ambiguities by introducing explicit per-track or per-edge scale variables or via joint camera-point optimization (Pan et al., 2024, Tao et al., 4 Jul 2025).

3. Applications and Special Cases

Global motion averaging is the backbone of numerous core problems:

Structure-from-Motion (SfM) and Visual SLAM: Computes camera poses and scene geometry from unordered or sequential image collections. Modern global SfM systems (e.g., GLOMAP, MGSfM) use motion averaging in both rotation and camera-point positioning stages for enhanced scalability and accuracy (Pan et al., 2024, Tao et al., 4 Jul 2025). Hybrid incremental-global pipelines inject averaged rotations as regularizers into incremental BA (Chen et al., 2021).
Large-scale Map Merging and Multi-view Registration: In robotics and remote sensing, grid maps or point clouds from multiple sources (robots or sensors) are fused into a global frame using robust motion averaging with spanning-tree or robust graph-sampling initializations (Jiang et al., 2017, Xu et al., 2024). The grid structure is often exploited for efficient data association and neighbor search.
Collinear and Degenerate Geometries: Collinear camera trajectories can cause classical translation averaging to become ill-posed or degenerate. Specialized spectral characterizations and rank-constrained averaging schemes, along with the introduction of virtual cameras, extend global motion averaging to mixed collinear/non-collinear cases, avoiding catastrophic failures (Geifman et al., 2019).
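The spanning-tree initialization mentioned for map merging and registration can be sketched by chaining relative motions outward from a root node, using $M_j = M_i M_{ij}$ along each tree edge. The example below is a minimal illustration assuming an unweighted breadth-first tree and 4x4 homogeneous SE(3) matrices; weighted or sampled trees, as used in the cited work, would change only how the tree is chosen. All names are hypothetical.

```python
import numpy as np
from collections import deque

def spanning_tree_init(rel, n, root=0):
    """Initialize absolute motions by chaining M_j = M_i M_ij along a BFS tree."""
    adj = {k: [] for k in range(n)}
    for (i, j), M_ij in rel.items():
        adj[i].append((j, M_ij))
        adj[j].append((i, np.linalg.inv(M_ij)))   # reversed edge: M_ji = M_ij^{-1}
    M = {root: np.eye(4)}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j, M_ij in adj[i]:
            if j not in M:                        # first visit defines the tree edge
                M[j] = M[i] @ M_ij
                queue.append(j)
    return M

def se3(theta, t):
    """Homogeneous transform: rotation by theta about z, then translation t."""
    c, s = np.cos(theta), np.sin(theta)
    M = np.eye(4)
    M[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    M[:3, 3] = t
    return M

# Toy problem: a chain graph whose edges have mixed directions.
M_true = [np.eye(4), se3(0.3, [1, 0, 0]), se3(-0.2, [2, 1, 0]), se3(0.8, [0, 3, 1])]
edges = [(0, 1), (2, 1), (2, 3)]
rel = {(i, j): np.linalg.inv(M_true[i]) @ M_true[j] for (i, j) in edges}
M_init = spanning_tree_init(rel, 4)
```

With noisy measurements, errors accumulate along tree paths, which is why such an estimate usually serves only as a starting point for the averaging itself.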

4. Robustness, Optimality, and Certification

A central concern is whether solutions are globally optimal and robust to real-world noise and outliers. Empirical and theoretical advances include:

Global certificates: Shonan's eigenvalue test, Laplacian bounds on residuals (Dellaert et al., 2020, Li et al., 2020), or optimality certificates from semidefinite relaxations (Dong et al., 2021).
Edge pruning and inlier selection: Methods such as weighted spanning-tree Monte Carlo, IRLS with edge pruning, and fast view-graph filtering (VGF) dramatically mitigate the impact of outlier or inconsistent constraints (Jiang et al., 2017, Chen et al., 2021, Li et al., 2020).
Covariance propagation: Direct estimation and use of the Jacobian-based uncertainty from two-view geometry improves edge weighting and confidence modeling (Zhang et al., 2023).
MAGSAC and threshold-free robust loss: MAGSAC-style marginalization over noise scales eliminates the need for hand-tuned inlier thresholds, automatically adapts to varying data quality, and increases resilience to outliers (Zhang et al., 2023).
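A loop-consistency check of the kind used for edge pruning can be sketched on triplets. For consistent measurements $R_{ij} = R_i^\top R_j$, the product $R_{ij} R_{jk} R_{ik}^\top$ is the identity, so its rotation angle measures how inconsistent a triangle is. The snippet is a minimal illustration on toy data; the function names are hypothetical, and any pruning threshold would be a separate design choice.

```python
import numpy as np

def rotation_angle(R):
    """Rotation angle (radians) of a 3x3 rotation matrix."""
    c = (np.trace(R) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def triplet_loop_error(R_ij, R_jk, R_ik):
    """Angle of R_ij R_jk R_ik^T; zero when the three edges are consistent."""
    return rotation_angle(R_ij @ R_jk @ R_ik.T)

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Consistent triangle built from three absolute rotations.
Ri, Rj, Rk = rot_z(0.2), rot_x(0.5), rot_z(-0.4) @ rot_x(0.1)
R_ij, R_jk, R_ik = Ri.T @ Rj, Rj.T @ Rk, Ri.T @ Rk
err_ok = triplet_loop_error(R_ij, R_jk, R_ik)

# Corrupting one edge by 0.8 rad produces a loop error of the same size.
err_bad = triplet_loop_error(R_ij, R_jk @ rot_z(0.8), R_ik)
```

A triangle with a large loop error indicates that at least one of its edges is unreliable, but not which one; aggregating the errors over all triangles that share an edge is one way to localize the outlier.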

5. Scalability, Complexity, and System Integration

Modern global motion averaging systems are engineered to scale to tens or hundreds of thousands of nodes (cameras/maps), frequently leveraging cluster-based or hierarchical decomposition (Zhu et al., 2017). Key factors contributing to efficiency:

Closed-form SVD steps for rotation blocks: Each node update requires only a $3 \times 3$ SVD, and parallelism is straightforward in BCD or SUM approaches (Dong et al., 2021).
Sparse linear solves for translation averaging: Once rotations are fixed, translation estimation reduces to large but sparse linear algebra (Zhu et al., 2017, Xu et al., 2024).
Hybrid convex–nonconvex pipelines: Convex initialization phases followed by robust nonconvex or manifold optimization, sometimes over multi-camera or joint camera-point spaces (Tao et al., 4 Jul 2025, Pan et al., 2024).

Complexity is typically linear or nearly linear in $|E|$ due to sparsity, and empirical runtimes for state-of-the-art systems now span seconds to minutes for problems of industrial scale (Dellaert et al., 2020, Pan et al., 2024, Tao et al., 4 Jul 2025).

6. Extensions and Current Directions

Recent developments are pushing the scope of global motion averaging to:

Joint optimization over cameras and points with per-edge scale absorption: Replaces classical translation averaging with a unified sparse LM over camera positions, 3D tracks, and unknown scales, increasing both robustness and initialization-freedom (Pan et al., 2024, Tao et al., 4 Jul 2025).
Hybrid and hierarchical models for multi-camera sensor rigs: Decoupled internal vs. global rotations and positions, convex plus non-bilinear unconstrained refinements, and simultaneous handling of camera-to-camera and camera-to-point constraints (Tao et al., 4 Jul 2025).
Outlier-robust, scalable multi-view registration for extremely large DSM and point set problems: Fast, O(N) complexity graph-based pipelines suitable for hundreds of millions of points, leveraging data structures intrinsic to the input modality (e.g., DSM grids) (Xu et al., 2024).
Collinearity-aware global averaging: Spectral methods and augmentation of the problem graph through synthetically introduced virtual cameras (Geifman et al., 2019).

Open challenges remain in the design of globally optimal yet highly efficient solvers in the presence of correlated noise, dynamic scenes, and partial or missing data.

The table below summarizes representative algorithmic themes, robustification strategies, and scaling properties from key works:

Category	Example Methods/Papers	Notes
Convex Relaxation	Shonan, SDP, LR-SDP	Global certifiability (Dellaert et al., 2020)
Block Coordinate/Parallel	BCD, SUM/MM	Closed-form, parallel (Dong et al., 2021)
Robust Losses	IRLS, Huber, MAGSAC	Outlier resilience (Zhang et al., 2023, Pan et al., 2024)
Hierarchical/Multi-camera	MGSfM, hybrid pipelines	Rigid-unit structure (Tao et al., 4 Jul 2025)
Collinearity-Aware	Rank 4, virtual cameras	Special handling (Geifman et al., 2019)
Large-scale Scene Registration	DSM-ICP, O(N) SVD+linear	Memory-aware, DSM grids (Xu et al., 2024)

Global motion averaging underpins high-accuracy, scalable geometric estimation in modern 3D vision, robotics, and mapping, continuously evolving through advances in robust loss design, algorithmic efficiency, and system integration.