GVINS: GNSS-Visual-Inertial Navigation System

Updated 7 February 2026
  • GVINS is a sensor fusion framework that integrates GNSS, visual, and inertial measurements to achieve globally consistent 6-DoF state estimation.
  • Its factor graph formulation combines IMU preintegration, visual reprojection, and GNSS constraints to mitigate global drift and handle urban multipath challenges.
  • Innovative variants like Sky-GVINS and GS-GVINS extend the framework with sky segmentation and dense 3D mapping for enhanced robustness in challenging environments.

GVINS (GNSS-Visual-Inertial Navigation System) is a class of tightly coupled sensor fusion frameworks that integrate Global Navigation Satellite System (GNSS) signals, visual measurements (from monocular or stereo cameras), and inertial data (from IMUs) for robust, real-time, globally referenced 6-DoF state estimation in large-scale navigation scenarios. GVINS solves the problem of global drift in visual-inertial odometry (VIO) by introducing conventional or raw GNSS measurements as explicit constraints in a factor graph or filtering architecture, producing globally consistent trajectories even under challenging signal availability and urban multipath conditions (Cao et al., 2021).

1. Core Principles and Factor Graph Formulation

The defining characteristic of GVINS is the joint estimation of navigation states by constructing and optimizing a factor graph containing the following measurement models:

  • IMU Preintegration Factors: Modeled via continuous- or discrete-time mechanization and noise propagation, optionally including Earth-rotation and Coriolis effects (Cao et al., 2021, Tang et al., 2022).
  • Visual Reprojection Factors: Each feature track gives rise to reprojection error terms, formulated over inverse depth or, in certain variants, using "pose-only" parameterizations to avoid explicit feature depth estimation (Cao et al., 2021, Xu et al., 13 Jan 2025).
  • GNSS Factors: Raw pseudorange and Doppler (velocity) measurements, or (in RTK/PPP variants) carrier-phase integer ambiguities, are modeled as nonlinear scalar constraints on the trajectory nodes, clock biases/drifts, and, where necessary, inter-sensor extrinsics. These can be tightly fused with visual-inertial factors in the joint optimization (Cao et al., 2021, Hu et al., 2024).
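As a concrete illustration of the first factor type, the following is a minimal sketch of IMU preintegration between two keyframes. Function names and the simple Euler-style integration are illustrative only; production systems use on-manifold mid-point integration with bias Jacobians and covariance propagation.

```python
import numpy as np

def exp_so3(phi):
    """Rodrigues formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    k = phi / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def preintegrate(accels, gyros, dt, b_a, b_g):
    """Fold raw IMU samples between consecutive keyframes into relative
    increments (dR, dv, dp) that are independent of the absolute state,
    so the factor need not be recomputed when the linearization point moves."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for a, w in zip(accels, gyros):
        a_c, w_c = a - b_a, w - b_g              # bias-corrected samples
        dp = dp + dv * dt + 0.5 * (dR @ a_c) * dt**2
        dv = dv + (dR @ a_c) * dt
        dR = dR @ exp_so3(w_c * dt)              # right-multiply rotation increment
    return dR, dv, dp
```

At optimization time the preintegrated increments are compared against the increments predicted by the two keyframe states, yielding the residual that enters the cost below.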

The generic optimization cost over the sliding window of states $\mathcal{X}$ is of the form:

$$J(\mathcal{X}) = \sum \|r_{\rm IMU}\|^2_{\Sigma_{\rm IMU}} + \sum \|r_{\rm vis}\|^2_{\Sigma_{\rm vis}} + \sum \|r_{\rm GNSS}\|^2_{\Sigma_{\rm GNSS}} + \cdots$$

Marginalization is employed to maintain bounded computational complexity, with window sizes typically set to $10$–$20$ keyframes and their associated feature tracks.
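A minimal sketch of how such a cost is evaluated is shown below. The flat factor list, the simple state layout, and the absence of robust kernels and marginalization priors are simplifications for illustration.

```python
import numpy as np

def mahalanobis_sq(r, Sigma):
    """Whitened squared residual ||r||^2_Sigma = r^T Sigma^{-1} r."""
    return float(r @ np.linalg.solve(Sigma, r))

def total_cost(factors, X):
    """Evaluate J(X) over all IMU, visual, and GNSS factors. Each factor
    is a (residual_fn, Sigma) pair; residual_fn maps the window states X
    to a residual vector. A nonlinear least-squares solver (e.g. a
    Gauss-Newton or Levenberg-Marquardt loop) would minimize this over X."""
    return sum(mahalanobis_sq(fn(X), Sigma) for fn, Sigma in factors)
```

Because every factor contributes an independent whitened residual, new GNSS constraints can be dropped into the same sum without restructuring the visual-inertial part of the problem.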

2. State Vector and Measurement Models

A representative GVINS state vector at window index $k$ is:

$$X_k = \left[\, p^w_{b_k};\; q^w_{b_k};\; v^w_{b_k};\; b_{a_k};\; b_{g_k};\; o_{t_k};\; s_k;\; \{\rho_{ik}\}_{i=1..N} \,\right]$$

where:

  • $p^w_{b_k}$: body position in the world frame
  • $q^w_{b_k}$: orientation (unit quaternion) representing a rotation in $SO(3)$
  • $v^w_{b_k}$: velocity in the world frame
  • $b_{a_k}$, $b_{g_k}$: accelerometer and gyroscope biases
  • $o_{t_k}$: GNSS receiver clock bias
  • $s_k$: clock drift rate
  • $\{\rho_{ik}\}$: inverse depths of visual features tracked at frame $k$ (Yin et al., 2023)
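For concreteness, one sliding-window node might be laid out as below. The field names are illustrative, not the actual data structures of any cited implementation.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class WindowState:
    """One node X_k of the sliding window."""
    p_wb: np.ndarray        # body position in world frame, shape (3,)
    q_wb: np.ndarray        # orientation as unit quaternion, shape (4,)
    v_wb: np.ndarray        # velocity in world frame, shape (3,)
    b_a: np.ndarray         # accelerometer bias, shape (3,)
    b_g: np.ndarray         # gyroscope bias, shape (3,)
    clk_bias: float         # GNSS receiver clock bias o_{t_k}, seconds
    clk_drift: float        # clock drift rate s_k, s/s
    inv_depths: dict = field(default_factory=dict)  # feature id -> inverse depth rho
```

Keeping the clock terms inside the per-node state is what lets raw pseudorange and Doppler factors attach directly to trajectory nodes in the joint optimization.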

Key measurement models:

  • IMU: Preintegration residuals as in [Forster et al.], including biases and gravity alignment.
  • Visual: Pinhole or other camera model reprojection residuals for tracked features.
  • GNSS:
    • Pseudorange: $r_{\rho_{ik}} = \rho_{ik} - \|\text{Rec}_k - s_{ie}\| - c\, o_{t_k}$
    • Carrier phase: $r_{\phi_{ik}} = \phi_{ik} - \left( \|\text{Rec}_k - s_{ie}\| + c\, o_{t_k} + \lambda N_i \right)$

Extensions enable, for instance, multi-constellation clock modeling (separate clock-bias and drift states for GPS, GLONASS, Galileo, BeiDou, etc.) (Cao et al., 2021).
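The pseudorange and carrier-phase residuals above can be sketched as follows. Atmospheric delays and satellite clock errors are assumed pre-corrected, and all positions are expressed in a common ECEF frame; these are the usual simplifications for illustration, not the full measurement model.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def pseudorange_residual(rho_meas, p_rec, p_sat, clk_bias):
    """r_rho = measured pseudorange - (geometric range + c * receiver clock bias)."""
    return rho_meas - np.linalg.norm(p_rec - p_sat) - C * clk_bias

def carrier_phase_residual(phi_meas, p_rec, p_sat, clk_bias, wavelength, N):
    """Carrier-phase residual in metres, with integer ambiguity N (cycles)
    scaled by the carrier wavelength."""
    return phi_meas - (np.linalg.norm(p_rec - p_sat) + C * clk_bias + wavelength * N)
```

Note that a 1 microsecond clock error already corresponds to roughly 300 m of range, which is why the clock bias and drift must be estimated jointly with the pose.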

3. Initialization, Observability, and Degeneracy Handling

GVINS frameworks incorporate a coarse-to-fine initialization to align a local VIO map with the global GNSS frame (ECEF or ENU):

  • Anchor Initialization: A single-point GNSS position fix anchors the translation; a short-window joint optimization aligns yaw and corrects clock bias/drift.
  • Yaw/Attitude Alignment: Performed via batch minimization of Doppler (velocity) residuals over brief temporal windows (Cao et al., 2021).
  • Online Extrinsic Calibration: Some implementations incorporate online estimation of GNSS-IMU extrinsics leveraging Doppler and pseudorange residuals (Hu et al., 2024).
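The yaw-alignment step can be illustrated as fitting a single rotation about the vertical between VIO velocities (local frame) and Doppler-derived velocities (ENU). The closed form below is a 2D orthogonal-Procrustes solution over the horizontal velocity components; it is a simplification of the batch residual minimization described above.

```python
import numpy as np

def align_yaw(v_local, v_enu):
    """Return the yaw angle (rad) that best rotates the horizontal
    components of the local-frame velocities onto the ENU velocities
    in the least-squares sense."""
    # Accumulate cross products (rotation evidence) and dot products (alignment evidence).
    num = sum(vl[0] * vg[1] - vl[1] * vg[0] for vl, vg in zip(v_local, v_enu))
    den = sum(vl[0] * vg[0] + vl[1] * vg[1] for vl, vg in zip(v_local, v_enu))
    return np.arctan2(num, den)
```

Because Doppler velocities are only informative when the platform moves, this step requires a short dynamic excitation window before the global yaw becomes observable.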

GVINS explicitly handles degeneracies, degrading gracefully to VIO in case of satellite outages or low Doppler informativeness. Even with fewer than four satellites, the estimator can still constrain subsets of the global pose space, preventing catastrophic drift (Cao et al., 2021).

4. Notable Extensions and Framework Variants

Several research efforts have extended baseline GVINS in specific directions:

  • Sky-GVINS: Integrates a fisheye sky-pointing camera and Otsu-thresholded sky segmentation to filter out non-line-of-sight (NLOS) GNSS measurements. Satellites whose projections fall on non-sky pixels are rejected, mitigating urban-canyon multipath and reducing ATE by up to 80% in dense environments (Yin et al., 2023).
  • IC-GVINS (INS-centric): Leverages a precise INS as the backbone for both state propagation and visual front-end aiding. INS outputs (with Earth-rotation correction) provide priors for tracking, outlier-culling, and triangulation, dramatically improving robustness in visually challenging scenes (Tang et al., 2022).
  • GS-GVINS: Augments GVINS with a differentiable, dense 3D Gaussian Splatting (3DGS) map. A photometric rendering factor supplies additional constraints via analytical Jacobians with respect to the pose; a pruning strategy maintains map quality under dynamic motion. Translation RMSEs in urban scenarios are consistently reduced by 4–71% over comparable methods (Zhou et al., 16 Feb 2025).
  • PO-GVINS: Adopts a "pose-only" filtering formulation that eliminates explicit landmark coordinates, representing feature depth using only pairs of camera poses. This minimizes dimensionality and numerical issues, with integer ambiguity resolution for GNSS carrier-phase incorporated directly (Xu et al., 13 Jan 2025).
  • SRI-GVINS: Implements the pipeline within a square-root inverse sliding-window filtering (SRI-SWF) framework, directly manipulating the square-root information form for efficient, numerically stable assimilation of high-rate visual, inertial, and all classes of GNSS measurements (pseudorange, Doppler, single/double-difference). Online GNSS-IMU calibration and sequential frame initialization are performed in-filter, achieving state-of-the-art real-time accuracy with low computational overhead (Hu et al., 2024).
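The Sky-GVINS gating idea can be sketched as a simple mask lookup: project each satellite direction into the upward-facing fisheye image and keep the measurement only if it lands on a sky pixel. Here `project` and `sky_mask` are placeholders for the real fisheye calibration and (e.g. Otsu-thresholded) segmentation outputs.

```python
import numpy as np

def reject_nlos(sat_dirs, project, sky_mask):
    """Return indices of satellites whose ray, projected into the
    sky-pointing fisheye image by `project` (unit direction -> pixel
    (u, v)), lands on a sky pixel of the binary segmentation mask."""
    h, w = sky_mask.shape
    kept = []
    for i, d in enumerate(sat_dirs):
        u, v = project(d)
        if 0 <= u < w and 0 <= v < h and sky_mask[v, u]:
            kept.append(i)  # open sky along this ray: likely line-of-sight
    return kept
```

Satellites rejected here simply contribute no GNSS factor for that epoch, so the estimator degrades toward VIO rather than absorbing multipath-corrupted pseudoranges.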

5. Experimental Outcomes and Benchmarking

GVINS systems have been validated in extensive real-world scenarios, including urban canyons, open-sky highways, and indoor-outdoor traverses:

  • Urban Driving: GVINS achieves ATE RMSE as low as 4.5 m over a 23 km route with continuous operation; visual-inertial-only methods fail or drift without bound after hundreds of meters (Cao et al., 2021).
  • Challenging Environments: Sky-GVINS demonstrates ATE reductions over baseline GVINS by more than 80% in urban canyon conditions, with IOU ≈ 97% for the Otsu sky-mask and negligible runtime overhead (Yin et al., 2023).
  • High-Integrity Navigation: IC-GVINS and GS-GVINS maintain sub-meter global accuracy and high robustness in mixed-feature or GNSS-poor locales, due to INS-aiding or dense map constraints (Tang et al., 2022, Zhou et al., 16 Feb 2025).
  • Filter Efficiency: SRI-GVINS demonstrates per-frame costs of 8.7 ms on conventional CPUs and 34.7 ms on embedded SoCs, outperforming equivalent batch-graph approaches in both speed and numerical stability (Hu et al., 2024).

6. Open Problems and Future Directions

Current and future work addresses several open problems:

  • Observability-Aware Estimation: Theoretical and empirical studies aim to characterize estimator performance under practical degeneracies and to selectively lock or relax unobservable directions (Cao et al., 2021, Hu et al., 2024).
  • NLOS and Multipath Mitigation: Leveraging learned or geometric priors (e.g., clouds, urban geometric ray-intersection) and further integrating sky segmentation as a factor (Yin et al., 2023, Zhou et al., 16 Feb 2025).
  • Dense and Multi-Modal Mapping: Joint optimization of continuous 3D maps (Gaussian Splatting, learned NeRF-like models) within the factor graph to enhance performance under severe feature impoverishment or motion dynamics (Zhou et al., 16 Feb 2025).
  • Carrier-Phase and PPP/RTK Fusion: Ongoing efforts target robust, real-time handling of integer ambiguity resolution for globally consistent centimeter-level positioning across multi-constellation networks (Xu et al., 13 Jan 2025, Hu et al., 2024).
  • Real-Time Embedded Deployment: Algorithmic and implementation advances focus on bounded-latency, resource-efficient variants for field robotics and UAVs under constrained compute (Tang et al., 2022, Hu et al., 2024).

7. Summary Table: GVINS Variants and Key Features

| Variant | Key Sensor Integration | Unique Algorithmic Feature |
|---|---|---|
| GVINS | Camera, IMU, GNSS | Tightly coupled factor graph, Doppler fusion (Cao et al., 2021) |
| Sky-GVINS | Camera, IMU, GNSS, sky camera | Fisheye-based sky segmentation, NLOS rejection (Yin et al., 2023) |
| IC-GVINS | INS, Camera, GNSS | INS-aided front- and back-ends (Tang et al., 2022) |
| GS-GVINS | Camera, IMU, GNSS, 3DGS | Photometric 3D Gaussian Splatting map (Zhou et al., 16 Feb 2025) |
| PO-GVINS | Camera, IMU, GNSS | Pose-only filtering, no explicit landmark estimation (Xu et al., 13 Jan 2025) |
| SRI-GVINS | Camera, IMU, GNSS | Square-root inverse sliding-window filter (Hu et al., 2024) |

GVINS frameworks have established themselves as the dominant paradigm for real-time, globally drift-free, and robust navigation across a wide span of robotics, UAV, and autonomous vehicle applications, with a rapid evolution of algorithmic sophistication focused on sensor fusion, scalability, and resilience to real-world degradation modes.
