MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment (2112.01349v3)

Published 2 Dec 2021 in cs.CV

Abstract: Large-scale Bundle Adjustment (BA) requires massive memory and computation resources which are difficult to be fulfilled by existing BA libraries. In this paper, we propose MegBA, a GPU-based distributed BA library. MegBA can provide massive aggregated memory by automatically partitioning large BA problems, and assigning the solvers of sub-problems to parallel nodes. The parallel solvers adopt distributed Precondition Conjugate Gradient and distributed Schur Elimination, so that an effective solution, which can match the precision of those computed by a single node, can be efficiently computed. To accelerate BA computation, we implement end-to-end BA computation using high-performance primitives available on commodity GPUs. MegBA exposes easy-to-use APIs that are compatible with existing popular BA libraries. Experiments show that MegBA can significantly outperform state-of-the-art BA libraries: Ceres (41.45$\times$), RootBA (64.576$\times$) and DeepLM (6.769$\times$) in several large-scale BA benchmarks. The code of MegBA is available at https://github.com/MegviiRobot/MegBA.

Citations (7)

Summary

  • The paper introduces MegBA, a GPU-based distributed library that partitions BA problems to overcome traditional memory and computation limits.
  • It applies distributed PCG and Schur Elimination algorithms alongside SIMD-optimized operations for efficient 3D vision processing.
  • Experimental results demonstrate up to 41× and 64× speed improvements over leading libraries, highlighting its scalability on multi-GPU systems.

MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment

The paper introduces MegBA, a GPU-based distributed library designed to address the computational and memory demands of large-scale Bundle Adjustment (BA). BA is a critical component of 3D vision applications such as structure-from-motion and simultaneous localization and mapping (SLAM). These tasks jointly refine camera poses and map points by minimizing the re-projection error between observed image measurements and the projections predicted by the current estimates, typically via iterative non-linear least-squares optimization.
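The re-projection error that BA minimizes can be sketched with a minimal pinhole camera model. The pose, intrinsics, and point values below are illustrative assumptions, not data from the paper:

```python
import numpy as np

def reproject(R, t, point, K):
    """Project a 3D map point into a camera with pose (R, t) and intrinsics K."""
    p_cam = R @ point + t          # world frame -> camera frame
    p_img = K @ p_cam              # camera frame -> homogeneous pixel coords
    return p_img[:2] / p_img[2]    # perspective division

def reprojection_error(R, t, point, K, observed_uv):
    """Residual for one camera/point edge; BA minimizes the sum of
    squared residuals over all edges in the graph."""
    return reproject(R, t, point, K) - observed_uv

# Toy example: identity pose, point 2m straight ahead of the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
r = reprojection_error(np.eye(3), np.zeros(3),
                       np.array([0.0, 0.0, 2.0]), K,
                       np.array([320.0, 240.0]))
```

A point on the optical axis projects exactly onto the principal point, so this toy residual is zero; real BA problems accumulate millions of such residuals.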

Core Contributions and Methodology

MegBA's key contribution lies in its ability to partition and distribute large BA problems across multiple GPUs, thus providing substantial aggregated memory and computational throughput. The paper identifies the limitations of existing BA libraries, which typically focus on single-node execution and rely heavily on CPU architectures, failing to utilize the full potential of GPUs.
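The partitioning idea can be illustrated with a deliberately simplified round-robin split of the graph's camera-point edges across devices; MegBA's actual partitioner balances per-GPU workloads more carefully, so this is an assumption-laden sketch rather than its real scheme:

```python
def partition_edges(edges, num_devices):
    """Split BA graph edges (camera, point) pairs into near-equal chunks,
    one chunk per device. Round-robin assignment keeps chunk sizes within
    one edge of each other."""
    chunks = [[] for _ in range(num_devices)]
    for i, edge in enumerate(edges):
        chunks[i % num_devices].append(edge)
    return chunks

# Toy BA graph: 4 cameras each observing 6 points -> 24 edges.
edges = [(cam, pt) for cam in range(4) for pt in range(6)]
parts = partition_edges(edges, 3)
```

Each device then builds and solves its sub-problem locally, exchanging only the shared solver state with its peers.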

MegBA introduces several innovations:

  1. Distributed BA Algorithms: MegBA partitions the BA graph by edges, ensuring a balanced workload across GPUs. It employs distributed Preconditioned Conjugate Gradient (PCG) and distributed Schur Elimination algorithms that synchronize solver state across nodes, achieving precision matching single-node solutions.
  2. GPU-Optimized Computation: The library fully exploits GPU compute capabilities by implementing its operators as Single-Instruction-Multiple-Data (SIMD) operations. Its core data structure, JetVector, stores BA data in SIMD-friendly vectors, minimizing data-transfer costs between CPUs and GPUs.
  3. Extensible API: The APIs offered by MegBA are designed for ease of use and compatibility with existing BA libraries like g2o and Ceres, facilitating easy integration for current users.
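The PCG solver underpinning item 1 can be sketched in single-node form with a Jacobi (diagonal) preconditioner; MegBA distributes these same vector operations across GPUs and synchronizes the scalar reductions, so this standalone version is illustrative only, not MegBA's implementation:

```python
import numpy as np

def pcg(A, b, max_iters=100, tol=1e-10):
    """Jacobi-preconditioned conjugate gradient for a symmetric
    positive-definite system A x = b. A minimal single-node sketch."""
    M_inv = 1.0 / np.diag(A)       # Jacobi preconditioner: inverse diagonal
    x = np.zeros_like(b)
    r = b - A @ x                  # initial residual
    z = M_inv * r                  # preconditioned residual
    p = z.copy()                   # initial search direction
    rz = r @ z
    for _ in range(max_iters):
        Ap = A @ p
        alpha = rz / (p @ Ap)      # step length along p
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p  # conjugate direction update
        rz = rz_new
    return x

# Small SPD test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(A, b)
```

In a distributed setting, the dot products (`p @ Ap`, `r @ z`) become all-reduce operations across nodes, which is why the distributed solver can match single-node precision.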

Experimental Evaluation and Results

Experimental results presented in the paper are compelling, showcasing MegBA's performance advantage over state-of-the-art libraries such as Ceres, RootBA, and DeepLM. For large-scale datasets like Final-13682, MegBA achieves performance improvements of up to 41.45× and 64.576× compared to Ceres and RootBA, respectively. The scalability is further demonstrated with experiments using up to eight NVIDIA V100 GPUs, where significant reductions in processing time were observed.

Implications and Future Directions

The implications of this work are significant for the development of large-scale 3D mapping systems, especially relevant for city-level high-definition maps and real-time applications like autonomous driving. By leveraging GPU parallelism and distributed computing, MegBA addresses the memory and computational constraints that traditionally limited the scale of problems that could be efficiently addressed using BA.

Looking forward, extending MegBA's framework to emerging hardware accelerators and integrating mixed-precision computing might offer further improvements in speed and memory efficiency. Additionally, exploring dynamic workload balancing and adaptive precision techniques could yield optimizations tailored to specific application requirements.

In conclusion, MegBA represents a significant step forward in the development of efficient, scalable solutions for large-scale BA problems, leveraging distributed GPU resources in a manner that provides substantial gains in both computational efficiency and memory utilization. As large-scale 3D vision tasks become increasingly prevalent, libraries like MegBA will be pivotal in pushing the boundaries of what is computationally feasible.
