Accelerating 3D Gaussian Splatting for Large-Scale Scenes with DoGaussian
Introduction
Large-scale 3D scene reconstruction has been an evolving field with significant improvements in recent years, and 3D Gaussian Splatting (3DGS) has showcased promising results in generating high-fidelity renderings. However, when working with extensive scenes like cityscapes, traditional 3DGS methods face challenges related to training time and GPU memory consumption.
The paper proposes a method called DoGaussian, which introduces a distributed approach to training 3DGS, leveraging scene decomposition and distributed consensus to address these issues. Let's break down how this innovative method works and what it means for the future of 3D scene reconstruction.
The Challenges
3D Gaussian Splatting (3DGS) is a technique that encodes scenes into sets of 3D Gaussians—each represented by a center position, a covariance matrix (parameterized by a scale and rotation), an opacity, and view-dependent appearance features (spherical harmonics coefficients). While efficient, it demands significant GPU memory and time to process large-scale scenes due to the sheer number of 3D Gaussians required.
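Concretely, the learnable parameters of one Gaussian can be sketched as follows. This is an illustrative data structure, not the paper's code; the names and the degree-3 spherical harmonics size (48 coefficients) follow the common 3DGS parameterization:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    """One 3D Gaussian primitive as used in 3DGS (illustrative sketch)."""
    position: np.ndarray   # (3,) center in world space
    scale: np.ndarray      # (3,) per-axis standard deviations
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z)
    opacity: float         # scalar in (0, 1)
    sh_coeffs: np.ndarray  # (48,) degree-3 SH for view-dependent color

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T, which is positive semi-definite by construction."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S @ R.T
```

Factoring the covariance into a rotation and scale is what lets 3DGS optimize it with unconstrained gradient descent while keeping it a valid covariance matrix.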
Key Challenges:
- High GPU Memory Usage: Training on large scenes requires holding numerous 3D Gaussians in memory, leading to potential capacity issues.
- Long Training Times: Large-scale scenes inherently involve more data, contributing to prolonged training periods.
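To see why memory becomes the bottleneck, a rough back-of-the-envelope estimate helps. Assuming the common 3DGS parameterization (3 position + 3 scale + 4 rotation + 1 opacity + 48 spherical-harmonics values = 59 floats per Gaussian) and an Adam-style optimizer that keeps two extra buffers per parameter, tens of millions of Gaussians already approach a single GPU's capacity before activations and gradients are counted. The numbers below are illustrative assumptions, not figures from the paper:

```python
# Rough memory estimate for training 3DGS at city scale (illustrative assumptions).
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48  # position, scale, rotation, opacity, SH (degree 3)
BYTES_PER_FLOAT = 4                        # fp32
ADAM_OVERHEAD = 3                          # parameters + two Adam moment buffers

def training_memory_gib(num_gaussians: int) -> float:
    """Approximate GPU memory (GiB) for parameters and optimizer state alone."""
    total_bytes = num_gaussians * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT * ADAM_OVERHEAD
    return total_bytes / 2**30

print(f"{training_memory_gib(30_000_000):.1f} GiB for 30M Gaussians")  # ~19.8 GiB
```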
The DoGaussian Approach
To tackle these problems, DoGaussian employs a distributed training paradigm using the Alternating Direction Method of Multipliers (ADMM). The method decomposes the scene into manageable blocks and maintains a global 3DGS model that is synchronized across all compute nodes.
Steps in DoGaussian:
- Scene Decomposition:
- The scene is split recursively into blocks, ensuring each block is of a similar size.
- This decomposition happens along the axis with the longest span to maintain balance.
- Distributed Training:
- Each block is trained in parallel on a separate compute node.
- A global model is maintained and updated using the ADMM consensus method, ensuring consistency across blocks.
- Consensus Step:
- After each training iteration, local 3D Gaussians are gathered and averaged into the global model.
- The updated global model is then shared with all nodes for the next iteration.
- Inference:
- Post-training, only the global model is retained for rendering, significantly reducing inference time and memory use.
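The steps above can be sketched in simplified form. This is a hedged illustration of the two core mechanisms—recursive longest-axis splitting and ADMM-style consensus averaging—not the paper's implementation; in practice only Gaussians shared across block boundaries would participate in consensus, and each node's local update would also minimize its block's rendering loss:

```python
import numpy as np

def split_scene(points: np.ndarray, max_points: int) -> list:
    """Recursively split point indices along the axis with the longest span."""
    def recurse(idx):
        if len(idx) <= max_points:
            return [idx]
        span = points[idx].max(axis=0) - points[idx].min(axis=0)
        axis = int(np.argmax(span))                  # axis with the longest span
        order = idx[np.argsort(points[idx, axis])]
        mid = len(order) // 2                        # median split keeps blocks balanced
        return recurse(order[:mid]) + recurse(order[mid:])
    return recurse(np.arange(len(points)))

def consensus_round(local_params: list, duals: list):
    """One ADMM consensus step over shared Gaussian parameters.

    local_params: per-node copies of the shared parameters (same shape)
    duals:        per-node scaled dual variables u_i (same shape)
    """
    # z-update: the global model is the average of (local copy + dual) across nodes
    z = np.mean([x + u for x, u in zip(local_params, duals)], axis=0)
    # dual update: u_i += x_i - z penalizes disagreement with the global model
    new_duals = [u + (x - z) for x, u in zip(local_params, duals)]
    return z, new_duals
```

In a full training loop, each node would take gradient steps on its block's photometric loss plus an ADMM penalty of the form (rho/2)·||x_i − z + u_i||², after which `consensus_round` recomputes the global model z and broadcasts it back to all nodes.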
Numerical Results
The paper highlights substantial gains in training speed without sacrificing rendering quality. Specifically, the authors report training over 6× faster than standard approaches on large-scale scenes while achieving state-of-the-art rendering quality. Here's a look at the key results:
- Training Time Reduction: Compared to the original 3DGS method, DoGaussian substantially cuts down the training duration.
- Rendering Quality: PSNR, SSIM, and LPIPS scores are competitive with or better than prior methods, with LPIPS in particular improving, indicating high-fidelity renderings.
Here's a summary table from the paper illustrating the performance:
| Method | PSNR (higher better) | SSIM (higher better) | LPIPS (lower better) |
|--------|----------------------|----------------------|----------------------|
| Mega-NeRF | 22.08 - 25.60 | 0.547 - 0.770 | 0.312 - 0.636 |
| Switch-NeRF | 21.54 - 26.51 | 0.541 - 0.795 | 0.271 - 0.616 |
| 3DGS | 24.13 - 25.51 | 0.688 - 0.791 | 0.214 - 0.347 |
| VastGaussian | 22.64 - 23.82 | 0.695 - 0.761 | 0.225 - 0.261 |
| DoGaussian | 24.01 - 25.78 | 0.681 - 0.804 | 0.204 - 0.257 |
Practical and Theoretical Implications
1. Practical Uses:
- Faster Training: Practical for industries needing quick turnaround on large-scale 3D reconstructions, such as urban planning and game development.
- Resource Efficiency: Reduced memory footprint makes it feasible on more modest hardware configurations.
2. Theoretical Impact:
- Distributed Training Models: Showcases an effective implementation of distributed consensus algorithms in the 3D modeling domain.
- Future Research: Paves the way for further optimizations in training efficiencies and distributed computing methods in deep learning models for graphics.
Future Directions
1. Enhanced Scene Splitting: Investigating more sophisticated splitting algorithms could balance load even more effectively, minimizing communication overhead and improving training speed further.
2. Dynamic Resource Allocation: Adapting the method to dynamically allocate resources based on scene complexity, potentially integrating with elastic cloud resources.
3. Broader Applications: Expanding beyond 3D scene reconstruction to other domains requiring large-scale spatial processing, like autonomous vehicle simulations, could be beneficial.
Conclusion
DoGaussian addresses significant bottlenecks in large-scale 3D Gaussian Splatting, providing both theoretical and practical advancements. By leveraging distributed training and scene-level consensus, it not only accelerates training but also maintains high-quality rendering, marking an important step forward in the field of 3D scene reconstruction.