FAST-LIVO2: Efficient LiDAR-Inertial Visual Odometry
- FAST-LIVO2 is an efficient LiDAR-Inertial-Visual Odometry system that fuses sensor data to achieve centimeter-level accuracy in real time.
- It employs a sequential ESIKF pipeline with direct registration and unified mapping, reducing computational and memory demands on embedded platforms.
- The system demonstrates robustness under sensor degeneracy and supports diverse applications including UAV navigation, airborne mapping, and 3D reconstruction.
FAST-LIVO2 is an efficient, direct LiDAR-Inertial-Visual Odometry (LIVO) system designed for accurate, real-time state estimation and mapping in robotics and autonomous systems. It extends tightly-coupled sensor fusion with algorithmic advances in sequential Kalman filtering, direct registration, unified mapping, and computational efficiency. The framework is widely recognized for its state-of-the-art accuracy, robustness to sensor degeneracy, and suitability for resource-constrained embedded platforms (Zheng et al., 2024; Zhou et al., 23 Jan 2025).
1. System Model and Sensor Fusion Pipeline
FAST-LIVO2 implements a tightly-coupled fusion of LiDAR, IMU, and monocular (or stereo) vision streams within a recursive error-state estimator. The system architecture includes:
- Scan Recombination and Motion Compensation: Raw, high-frequency LiDAR points are synchronized and motion-compensated to the camera timestamp using inertial mechanization.
- Error-State Iterated Kalman Filter (ESIKF): The core estimator integrates IMU, LiDAR, and image measurements on the manifold, with a 19-dimensional state comprising attitude, position, velocity, biases, gravity, and photometric exposure (Zheng et al., 2024; Zhou et al., 23 Jan 2025).
- Sequential Update Strategy: Because LiDAR and visual measurements differ in dimension and noise characteristics, the posterior is factorized and the updates are carried out sequentially within the IEKF: LiDAR measurements update first for geometric alignment, followed by visual measurements for photometric consistency (Zheng et al., 2024).
- Unified Voxel Map: All geometric (LiDAR) and visual (image patch) information is registered in a single, sliding-window, hash-indexed voxel octree structure, enabling direct fusion at the map level (Zheng et al., 2024).
The overall pipeline at each processing cycle is as follows: IMU propagation → LiDAR ESIKF update → Visual ESIKF update → Local map geometry/patch update.
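The per-cycle structure above can be illustrated with a deliberately simplified sketch: a linear Kalman filter on a toy 2-D state, with one propagation step followed by two sequential measurement updates standing in for the LiDAR and visual stages. This is a hypothetical illustration of the control flow only; the actual system runs an iterated error-state filter on the manifold with the 19-dimensional state described above.

```python
import numpy as np

def propagate(x, P, Q):
    # "IMU propagation" stand-in: identity motion model plus process noise Q.
    return x, P + Q

def update(x, P, H, z, R):
    # One Kalman measurement update; called once per modality, in sequence.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Toy cycle: propagate -> "LiDAR" update (observes dim 0) -> "visual" update (dim 1).
x = np.zeros(2)
P = np.eye(2)
Q = 0.01 * np.eye(2)
x, P = propagate(x, P, Q)
x, P = update(x, P, np.array([[1.0, 0.0]]), np.array([1.0]), np.array([[0.1]]))  # LiDAR stage
x, P = update(x, P, np.array([[0.0, 1.0]]), np.array([2.0]), np.array([[0.1]]))  # visual stage
```

In the real pipeline each stage additionally iterates until the error state converges, and the map update follows the second measurement stage.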
2. Key Algorithmic Components
Direct Registration
Both LiDAR and visual modules use direct registration:
- LiDAR Point-to-Plane: For each de-skewed LiDAR point, a local plane is fitted from neighboring map points. The system minimizes the point-to-plane residual—without explicit edge/plane feature extraction—within the IEKF (Xu et al., 2021; Zheng et al., 2024).
- Sparse-Direct Photometric Alignment: For selected visual map points, direct pixel-level photometric error is minimized using affine-warped image patches, accounting for exposure time as part of the state (Zheng et al., 2024).
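The point-to-plane residual at the core of the LiDAR module can be sketched in a few lines. The plane-fitting and residual functions below are hypothetical simplifications (the actual system fits planes with uncertainty estimates inside the voxel map), but they show the quantity being minimized:

```python
import numpy as np

def fit_plane(points):
    # Fit a local plane to neighboring map points: the normal is the direction
    # of least variance, obtained from the SVD of the centered point cloud.
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid  # (unit normal, point on plane)

def point_to_plane_residual(p_world, plane_normal, plane_point):
    # Signed distance of a de-skewed LiDAR point to the fitted plane; this is
    # the scalar residual driving the geometric part of the IEKF update.
    n = plane_normal / np.linalg.norm(plane_normal)
    return float(n @ (p_world - plane_point))

# Toy example: neighbors lie on the z = 0 plane; query point is 0.2 m above it.
pts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
n, c = fit_plane(pts)
r = point_to_plane_residual(np.array([0.5, 0.5, 0.2]), n, c)  # |r| = 0.2
```

The photometric residual is analogous in spirit: instead of a signed distance, it is the intensity difference between an affine-warped reference patch and the current image, with exposure compensated through the state.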
Unified Mapping
- Voxel Structure: Root voxels (0.5 m³) are managed in a hash map; each leaf holds plane priors along with visual map points with patch pyramids. This structure provides mutual benefit: LiDAR-derived plane priors inform photometric tracking; image exposure/lighting is adaptively estimated and updated (Zheng et al., 2024).
- Reference Patch Selection and Raycasting: Reference patches are dynamically scored and replaced for robustness. On-demand raycasting is used in visual blind zones, extracting candidate features via voxel lookups based on image grid coverage (Zheng et al., 2024).
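A rough illustration of the unified map's layout follows. This is a hypothetical, simplified Python sketch: the actual implementation is C++, subdivides each root voxel with an octree, and stores full patch pyramids rather than plain (score, patch) pairs.

```python
import numpy as np

VOXEL_SIZE = 0.5  # root voxel edge length in meters, per the paper's description

def voxel_key(p, size=VOXEL_SIZE):
    # Hashable integer key of the root voxel containing point p.
    return tuple(np.floor(np.asarray(p) / size).astype(int))

class Voxel:
    def __init__(self):
        self.points = []   # raw LiDAR points (input to plane fitting)
        self.plane = None  # (normal, centroid) prior once fitted
        self.patches = []  # visual reference patches as (score, patch) pairs

    def best_patch(self):
        # Dynamic reference selection: keep the highest-scoring patch.
        return max(self.patches, key=lambda sp: sp[0], default=None)

# Hash map from voxel key to voxel contents; insertion is an O(1) hash lookup.
voxels = {}
for p in np.random.default_rng(0).uniform(0, 2, size=(100, 3)):
    voxels.setdefault(voxel_key(p), Voxel()).points.append(p)

# Replacing a stale reference patch with a better-scored one:
v = next(iter(voxels.values()))
v.patches = [(0.3, "old_patch"), (0.9, "new_patch")]
```

On-demand raycasting in this picture amounts to stepping along a camera ray and probing `voxels` for keys with usable planes/patches wherever the image grid lacks tracked points.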
3. Performance Metrics and Comparative Benchmarks
FAST-LIVO2 demonstrates significant advances in multiple performance dimensions:
| Method | Translational RMSE (m) | CPU (ms/frame, x86) | Embedded (ms/frame, ARM) | Memory (GB) | Dataset |
|---|---|---|---|---|---|
| FAST-LIVO2 | 0.044 | 30 | 78 | 2.5 | Hilti, Mars-LVIG |
| FAST-LIO2 | 0.151 | — | — | — | — |
| LVI-SAM | 1.93 | — | — | — | — |
| Ours†[2501...] | 0.063 | 26 | 57.8 | 1.7 | Hilti, Private |
†"Ours" denotes a resource-optimized variant of FAST-LIVO2 with adaptive visual keyframing and a two-tiered map (Zhou et al., 23 Jan 2025).
Key insights:
- Robust centimeter-level accuracy (RMSE as low as 0.044 m), outperforming feature-based and indirect fusion competitors on both public and private datasets.
- Real-time operation at 10 Hz on commodity x86 CPUs (<31 ms/frame) and on embedded ARM boards (<80 ms/frame).
- Robustness under sensor degeneracy and severe brightness or texture changes (via on-demand raycasting and dynamic exposure estimation).
- Memory usage and computational footprint are further reduced in variants for edge devices, with only minor accuracy trade-offs (Zhou et al., 23 Jan 2025).
4. Handling Sensor Degeneracy and Resource Constraints
FAST-LIVO2 and its derivatives implement principled strategies for managing sensor degeneracy:
- Degeneracy-Aware Visual Frame Selector: Employs a LiDAR-constraint metric based on the normal matrix spectrum. In degenerate scenes (e.g., featureless walls), all images are promoted to keyframes, ensuring visual tracking is not compromised (Zhou et al., 23 Jan 2025).
- Compute and Memory Efficiency: A hybrid mapping strategy is used: a small, local robocentric surfel map (LiDAR + visual patches) is complemented by a long-term global visual map for sparse features, culling out-of-range voxels and detaching surfels when possible. This achieves a ≈47% memory reduction and ≈33% runtime reduction with only a ≈3 cm RMSE increase (Zhou et al., 23 Jan 2025).
- ARM-Specific Optimization: Evaluation on ARM SoCs demonstrates robust operation (<2 cm drift in real-world trials) and real-time performance in embedded deployments (Zhou et al., 23 Jan 2025).
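The LiDAR-constraint metric behind the degeneracy-aware frame selector can be sketched from the spectrum of the normal matrix N = Σᵢ nᵢnᵢᵀ built from the scan's plane normals: if the smallest eigenvalue is near zero, the normals fail to constrain all three translation directions. The score and threshold below are hypothetical illustrations, not the paper's tuned formulation:

```python
import numpy as np

def lidar_degeneracy_score(normals, eps=1e-12):
    # Ratio of smallest to largest eigenvalue of N = sum_i n_i n_i^T.
    # Near 0 => degenerate geometry (e.g. a single featureless wall);
    # near 1 => well-constrained in all translation directions.
    N = sum(np.outer(n, n) for n in normals)
    eigvals = np.linalg.eigvalsh(N)  # ascending order
    return eigvals[0] / (eigvals[-1] + eps)

wall = [np.array([1.0, 0, 0])] * 50  # one dominant normal direction
corner = [np.array([1.0, 0, 0]),
          np.array([0, 1.0, 0]),
          np.array([0, 0, 1.0])] * 10  # three orthogonal wall normals

# Hypothetical policy: promote every image to a keyframe when degenerate.
promote_all_keyframes = lidar_degeneracy_score(wall) < 0.1
```

In the wall case the score collapses to zero and visual keyframes are promoted; in the corner case all three directions are constrained and the normal keyframing cadence applies.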
5. Applications and Downstream Uses
FAST-LIVO2 supports a range of robotics and vision applications:
- Onboard UAV Navigation: Onboard compute enables closed-loop control with end-to-end drifts <10 cm in indoor, outdoor, and narrow-opening tests (Zheng et al., 2024).
- Airborne Mapping and City-Scale Modeling: Demonstrated in Mars-LVIG benchmarks with dense, colored point clouds produced in real time and RMSE down to 0.27 m (Zheng et al., 2024).
- 3D Reconstruction and Rendering: Dense maps are suitable for mesh+texture generation and data-driven NeRF/3DGS pipelines, offering high-quality geometry and photometric consistency (Zheng et al., 2024).
- Resource-Constrained Deployment: The system runs efficiently on edge devices, enabling real-time LIVO on platforms without a discrete GPU (Zhou et al., 23 Jan 2025).
6. Algorithmic Innovations and Comparison to Prior Work
FAST-LIVO2 builds on the foundational FAST-LIO2 LiDAR-inertial system (Xu et al., 2021) by introducing:
- Direct LiDAR-Visual Fusion: Prior works often required explicit, hand-engineered extraction of edge, plane, or point features, or relied on loosely-coupled optimization. FAST-LIVO2 eliminates this, directly fusing raw sensor streams for improved geometric and photometric observability (Zheng et al., 2024).
- Sequential ESIKF: Sequential handling of multimodal measurements provides better numerical conditioning and compatibility with highly heterogeneous data (Zheng et al., 2024).
- Unified Map Supporting Direct Visual and Geometric Updates: Plane priors and patch-pyramid structures enable cross-modal benefits and improved accuracy under ambiguous conditions (Zheng et al., 2024).
- Algorithmic Efficiency: The incremental k-d tree ("ikd-Tree") of FAST-LIO2 (Xu et al., 2021) was extended to hash-octree voxel management and hybrid feature mapping, reducing per-point insertion and query times to O(log N).
- Dynamic Reference Patching and Raycasting: These methods robustify alignment in degenerate scenes, consistent with the observed robustness on long, featureless traversals (Zheng et al., 2024).
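The sequential-ESIKF point can be sanity-checked on a toy linear case: with independent measurement noise, processing the two modalities one after another yields the same posterior as stacking them into a single joint update. This is a minimal sketch of that equivalence, not the paper's manifold-based iterated filter, where the stages are not exactly interchangeable:

```python
import numpy as np

def kf_update(x, P, H, z, R):
    # Standard linear Kalman measurement update.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return x + K @ (z - H @ x), (np.eye(len(x)) - K @ H) @ P

x0, P0 = np.zeros(3), np.diag([1.0, 2.0, 3.0])
H1, z1, R1 = np.array([[1.0, 0, 0]]), np.array([0.5]), np.array([[0.1]])    # "LiDAR"
H2, z2, R2 = np.array([[0, 1.0, 1.0]]), np.array([1.5]), np.array([[0.2]])  # "visual"

# Sequential: apply the two independent updates one after the other.
xs, Ps = kf_update(*kf_update(x0, P0, H1, z1, R1), H2, z2, R2)

# Joint: stack both measurements into one block update.
Hj = np.vstack([H1, H2])
zj = np.concatenate([z1, z2])
Rj = np.block([[R1, np.zeros((1, 1))], [np.zeros((1, 1)), R2]])
xj, Pj = kf_update(x0, P0, Hj, zj, Rj)
# np.allclose(xs, xj) and np.allclose(Ps, Pj) both hold.
```

The practical advantage of the sequential form is that each modality's update works with its own, much smaller innovation covariance, which is what gives the better numerical conditioning with heterogeneous measurement dimensions.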
7. Open Source Availability and Extensibility
The FAST-LIVO2 codebase, benchmarks, and datasets are openly available at https://github.com/hku-mars/FAST-LIVO2 and are fully ROS-compatible for rapid integration with robotic autonomy stacks (Zheng et al., 2024). Key parameters—patch size, map voxelization, LiDAR downsampling, and IMU noise—are configurable for deployment across a wide range of hardware and operational environments.
References:
- "FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry" (Zheng et al., 2024)
- "FAST-LIVO2 on Resource-Constrained Platforms: LiDAR-Inertial-Visual Odometry with Efficient Memory and Computation" (Zhou et al., 23 Jan 2025)
- "FAST-LIO2: Fast Direct LiDAR-inertial Odometry" (Xu et al., 2021)