
DiffGS: Functional Gaussian Splatting Diffusion (2410.19657v2)

Published 25 Oct 2024 in cs.CV

Abstract: 3D Gaussian Splatting (3DGS) has shown convincing performance in rendering speed and fidelity, yet the generation of Gaussian Splatting remains a challenge due to its discreteness and unstructured nature. In this work, we propose DiffGS, a general Gaussian generator based on latent diffusion models. DiffGS is a powerful and efficient 3D generative model which is capable of generating Gaussian primitives at arbitrary numbers for high-fidelity rendering with rasterization. The key insight is to represent Gaussian Splatting in a disentangled manner via three novel functions to model Gaussian probabilities, colors and transforms. Through the novel disentanglement of 3DGS, we represent the discrete and unstructured 3DGS with continuous Gaussian Splatting functions, where we then train a latent diffusion model with the target of generating these Gaussian Splatting functions both unconditionally and conditionally. Meanwhile, we introduce a discretization algorithm to extract Gaussians at arbitrary numbers from the generated functions via octree-guided sampling and optimization. We explore DiffGS for various tasks, including unconditional generation, conditional generation from text, image, and partial 3DGS, as well as Point-to-Gaussian generation. We believe that DiffGS provides a new direction for flexibly modeling and generating Gaussian Splatting.

References (86)
  1. Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5855–5864, 2021.
  2. Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5479, 2022.
  3. Demystifying MMD GANs. arXiv preprint arXiv:1801.01401, 2018.
  4. Large-vocabulary 3D diffusion model with transformer. arXiv preprint arXiv:2309.07920, 2023.
  5. Efficient geometry-aware 3d generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16123–16133, 2022.
  6. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015.
  7. TensoRF: Tensorial radiance fields. In European Conference on Computer Vision, pages 333–350. Springer, 2022.
  8. Single-stage diffusion nerf: A unified approach to 3d generation and reconstruction. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2416–2425, 2023.
  9. Text-to-3d using gaussian splatting. arXiv preprint arXiv:2309.16585, 2023.
  10. Learning implicit fields for generative shape modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5939–5948, 2019.
  11. SDFusion: Multimodal 3D shape completion, reconstruction, and generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4456–4465, 2023.
  12. Diffusion-SDF: Conditional generative modeling of signed distance functions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2262–2272, 2023.
  13. Objaverse: A universe of annotated 3d objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13142–13153, 2023.
  14. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5501–5510, 2022.
  15. Geo-Neus: Geometry-consistent neural implicit surfaces learning for multi-view reconstruction. Advances in Neural Information Processing Systems (NeurIPS), 2022.
  16. Get3d: A generative model of high quality 3d textured shapes learned from images. Advances In Neural Information Processing Systems, 35:31841–31854, 2022.
  17. 3DGen: Triplane latent diffusion for textured mesh generation. arXiv preprint arXiv:2303.05371, 2023.
  18. Binocular-guided 3d gaussian splatting with view consistency for sparse view synthesis. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
  19. GVGEN: Text-to-3D generation with volumetric representation. arXiv preprint arXiv:2403.12957, 2024.
  20. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30, 2017.
  21. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
  22. Lrm: Large reconstruction model for single image to 3D. arXiv preprint arXiv:2311.04400, 2023.
  23. 2d gaussian splatting for geometrically accurate radiance fields. arXiv preprint arXiv:2403.17888, 2024.
  24. NeuSurf: On-surface priors for neural surface reconstruction from sparse input views. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024.
  25. Music-udf: Learning multi-scale dynamic grid representation for high-fidelity surface reconstruction from point clouds. Computers & Graphics, page 104081, 2024.
  26. Multi-grid representation with field regularization for self-supervised surface reconstruction from point clouds. Computers & Graphics, 2023.
  27. Shap-e: Generating conditional 3d implicit functions. arXiv preprint arXiv:2305.02463, 2023.
  28. 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4):1–14, 2023.
  29. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  30. Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normalization. arXiv preprint arXiv:2403.06912, 2024.
  31. NeAF: Learning neural angle fields for point normal estimation. In Proceedings of the AAAI Conference on Artificial Intelligence, 2023.
  32. Learning continuous implicit field with local distance indicator for arbitrary-scale point cloud upsampling. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024.
  33. Magic3d: High-resolution text-to-3d content creation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 300–309, 2023.
  34. Marching cubes: A high resolution 3D surface construction algorithm. ACM Siggraph Computer Graphics, 21(4):163–169, 1987.
  35. Diffusion probabilistic models for 3d point cloud generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2837–2845, 2021.
  36. GeoDream: Disentangling 2D and geometric priors for high-fidelity and consistent 3D generation. arXiv preprint arXiv:2311.17971, 2023.
  37. Towards better gradient consistency for neural signed distance functions via level set alignment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17724–17734, 2023.
  38. Donald Meagher. Geometric modeling using octree encoding. Computer graphics and image processing, 19(2):129–147, 1982.
  39. Occupancy networks: Learning 3D reconstruction in function space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4460–4470, 2019.
  40. Latent-nerf for shape-guided generation of 3d shapes and textures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12663–12673, 2023.
  41. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, 2020.
  42. Diffrf: Rendering-guided 3d radiance field diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4328–4338, 2023.
  43. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG), 41(4):1–15, 2022.
  44. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pages 8162–8171. PMLR, 2021.
  45. Multipull: Detailing signed distance functions by pulling multi-level queries at multi-step. In Advances in Neural Information Processing Systems, 2024.
  46. DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 165–174, 2019.
  47. Nerfies: Deformable neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5865–5874, 2021.
  48. Dreamfusion: Text-to-3d using 2d diffusion. arXiv preprint arXiv:2209.14988, 2022.
  49. D-nerf: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10318–10327, 2021.
  50. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 652–660, 2017.
  51. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
  52. Dreambooth3d: Subject-driven text-to-3d generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2349–2359, 2023.
  53. Hierarchical text-conditional image generation with clip latents, 2022.
  54. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022.
  55. 3d neural field generation using triplane diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20875–20886, 2023.
  56. Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior. arXiv preprint arXiv:2310.16818, 2023.
  57. Splatter image: Ultra-fast single-view 3d reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10208–10217, 2024.
  58. Lgm: Large multi-view gaussian model for high-resolution 3d content creation. In European Conference on Computer Vision, pages 1–18. Springer, 2025.
  59. Dreamgaussian: Generative gaussian splatting for efficient 3d content creation. arXiv preprint arXiv:2309.16653, 2023.
  60. Make-it-3d: High-fidelity 3d creation from a single image with diffusion prior. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22819–22829, 2023.
  61. Volumediffusion: Flexible text-to-3d generation with efficient volumetric encoder. arXiv preprint arXiv:2312.11459, 2023.
  62. NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. Advances in Neural Information Processing Systems, 34:27171–27183, 2021.
  63. Rodin: A generative model for sculpting 3d digital avatars using diffusion. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4563–4573, 2023.
  64. Neus2: Fast learning of neural implicit surfaces for multi-view reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3295–3306, 2023.
  65. Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. Advances in Neural Information Processing Systems, 36, 2024.
  66. 3D shape reconstruction from 2D images with disentangled attribute flow. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3803–3813, 2022.
  67. Octrees for faster isosurface generation. ACM Transactions on Graphics (TOG), 11(3):201–227, 1992.
  68. 4d gaussian splatting for real-time dynamic scene rendering. arXiv preprint arXiv:2310.08528, 2023.
  69. Dream3d: Zero-shot text-to-3d synthesis using 3d shape prior and text-to-image diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20908–20918, 2023.
  70. Grm: Large gaussian reconstruction model for efficient 3d reconstruction and generation. arXiv preprint arXiv:2403.14621, 2024.
  71. Gaussiandreamer: Fast generation from text to 3d gaussian splatting with point cloud priors. arXiv preprint arXiv:2310.08529, 2023.
  72. Gaussiancube: Structuring gaussian splatting using optimal transport for 3d generative modeling. arXiv preprint arXiv:2403.19655, 2024.
  73. Gs-lrm: Large reconstruction model for 3d gaussian splatting. arXiv preprint arXiv:2404.19702, 2024.
  74. Neural signed distance function inference through splatting 3d gaussians pulled on zero-level set. In Advances in Neural Information Processing Systems, 2024.
  75. Learning unsigned distance functions from multi-view images with volume rendering priors. European Conference on Computer Vision, 2024.
  76. Zero-shot scene reconstruction from single images with deep prior assembly. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
  77. Cap-udf: Learning unsigned distance functions progressively from raw point clouds with consistency-aware field optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
  78. Learning a more continuous zero level set in unsigned distance fields through level set projection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.
  79. Fast learning of signed distance functions from noisy point clouds via noise to noise mapping. IEEE transactions on pattern analysis and machine intelligence, 2024.
  80. Learning consistency-aware unsigned distance functions progressively from raw point clouds. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
  81. Differentiable registration of images and lidar point clouds with voxelpoint-to-pixel matching. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
  82. Uni3D: Exploring Unified 3D Representation at Scale. In International Conference on Learning Representations (ICLR), 2024.
  83. 3d-oae: Occlusion auto-encoders for self-supervised learning on point clouds. IEEE International Conference on Robotics and Automation (ICRA), 2024.
  84. Udiff: Generating conditional unsigned distance fields with optimal wavelet diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
  85. Deep fashion3d: A dataset and benchmark for 3d garment reconstruction from single images. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I 16, pages 512–530. Springer, 2020.
  86. Triplane meets gaussian splatting: Fast and generalizable single-view 3d reconstruction with transformers. arXiv preprint arXiv:2312.09147, 2023.
Authors (3)
  1. Junsheng Zhou (28 papers)
  2. Weiqi Zhang (21 papers)
  3. Yu-Shen Liu (79 papers)
Citations (7)

Summary

Analysis of "DiffGS: Functional Gaussian Splatting Diffusion"

The paper "DiffGS: Functional Gaussian Splatting Diffusion" presents an advanced approach to the generation of 3D Gaussian Splatting (3DGS), a representation known for its real-time rendering capabilities and potential for high-fidelity visual output. The authors propose DiffGS, a novel model that leverages latent diffusion to address the inherent challenges of unstructured and discrete Gaussian Splatting.

Core Contributions

  1. Functional Representation: DiffGS introduces a unique method of representing Gaussian Splatting through three disentangled functions: the Gaussian Probability Function (GauPF), Gaussian Color Function (GauCF), and Gaussian Transform Function (GauTF). This representation allows for continuous modeling of 3DGS, overcoming the limitations posed by its discrete nature.
  2. Generative Framework: The authors propose a Gaussian Variational Auto-Encoder (VAE) coupled with a Latent Diffusion Model (LDM) to generate these continuous functions. The VAE encodes 3DGS into latent vectors, while the LDM learns to generate new 3D shapes in this latent space, enabling both unconditional and conditional generation.
  3. Discretization Algorithm: An innovative octree-guided sampling and optimization algorithm is introduced. This method allows for efficient geometry extraction from generated Gaussian probabilities, providing a scalable way to generate Gaussians at arbitrary resolutions.
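The octree-guided extraction in the third contribution can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: `gau_pf`, `gau_cf`, and `gau_tf` are toy analytic stand-ins for the learned neural fields (here modeling Gaussians concentrated on a unit sphere), and the depth limits and probability threshold are arbitrary choices for the example.

```python
import numpy as np

# Toy analytic stand-ins for the three learned fields (the paper's
# GauPF / GauCF / GauTF are neural networks; these are illustrative only).
def gau_pf(p):                      # Gaussian Probability Function
    return np.exp(-8.0 * abs(np.linalg.norm(p) - 1.0))

def gau_cf(p):                      # Gaussian Color Function (RGB in [0,1])
    return 0.5 * (p / max(np.linalg.norm(p), 1e-8) + 1.0)

def gau_tf(p):                      # Gaussian Transform Function
    # Placeholder (scale, rotation quaternion, opacity) attributes.
    return np.full(3, 0.05), np.array([1.0, 0.0, 0.0, 0.0]), gau_pf(p)

def octree_sample(center, half, depth, max_depth, thresh, out):
    """Subdivide cells toward high-probability regions; emit leaf centers."""
    # Subdivide unconditionally near the root so thin structures are not
    # pruned before any cell center lands close to them.
    if depth >= 3 and gau_pf(center) < thresh:
        return
    if depth == max_depth:
        out.append(center)
        return
    for dx in (-0.5, 0.5):
        for dy in (-0.5, 0.5):
            for dz in (-0.5, 0.5):
                child = center + np.array([dx, dy, dz]) * half
                octree_sample(child, half / 2.0, depth + 1,
                              max_depth, thresh, out)

# Extract candidate Gaussian positions, then query the attribute fields
# at each position to assemble full Gaussian primitives.
positions = []
octree_sample(np.zeros(3), 1.0, 0, 5, 0.05, positions)
gaussians = [(p, gau_cf(p), *gau_tf(p)) for p in positions]
```

In the paper's full pipeline, the sampled positions would additionally be optimized against the generated GauPF before attribute queries, since the discretization algorithm couples octree sampling with optimization; deepening the octree yields more Gaussians, matching the "arbitrary numbers" property described above.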

Empirical Evaluation

The researchers demonstrate the efficacy of DiffGS across several tasks:

  • Unconditional Generation: Tested on ShapeNet's airplane and chair classes, DiffGS surpasses existing methods such as GET3D and DiffTF on both the FID and KID metrics.
  • Conditional Generation: The model shows strong results in conditional generation based on text, images, and partial 3DGS inputs, further illustrating its versatility and applicability in different contexts.
  • Point-to-Gaussian Generation: Tested on ShapeNet and DeepFashion3D datasets, DiffGS effectively translates point cloud data into high-quality Gaussian primitives.

Implications and Speculation

The functional approach of DiffGS opens new avenues in 3D content generation by fostering more flexible and efficient modeling. Practically, this could enhance tools for virtual reality, game development, and film production, where real-time rendering and high-quality visualization are critical. Theoretically, the disentangled functional representation might inspire further research in continuous modeling techniques for inherently discrete data.

Considering future applications in AI and related fields, DiffGS's method of leveraging diffusion models could pave the way for more adaptive and robust 3D generative frameworks. The seamless integration with existing 2D and 3D data generators could also enhance cross-domain synthesis capabilities.

Conclusion

DiffGS offers a substantial contribution to the field of 3D generative modeling. By efficiently marrying the strengths of diffusion models with a novel representational schema, it sets a strong precedent for future research and application in graphics and beyond. The model's ability to accommodate varying granularities of Gaussian primitives without sacrificing quality or computational efficiency makes it a robust tool for both academic exploration and practical deployment.
