OmniSplat: Real-Time 3D Gaussian Splatting
- OmniSplat is a feed-forward 3D Gaussian Splatting framework that generates 3D radiance fields from sparse omnidirectional images in a single neural inference step.
- It employs a Yin–Yang grid decomposition to counter non-uniform sampling and uneven Gaussian distribution challenges inherent in 360° imagery.
- By leveraging pre-trained perspective CNNs, OmniSplat achieves real-time, editable 3D scene acquisition without the need for per-scene optimization.
OmniSplat is a feed-forward 3D Gaussian Splatting (3DGS) framework developed to directly generate 3D radiance field representations from sparse omnidirectional (360°) images in a single neural inference step, eliminating the requirement for per-scene optimization and specialized omnidirectional training data. The system leverages the Yin–Yang grid decomposition to circumvent the geometric and sampling challenges inherent to omnidirectional images, enabling the reuse of convolutional neural networks (CNNs) pre-trained on perspective datasets and facilitating real-time, editable 3D scene acquisition (Lee et al., 2024).
1. Feed-Forward 3D Gaussian Splatting and Omnidirectional Challenges
Feed-forward 3D Gaussian Splatting systems such as PixelSplat and MVSplat take a small number of perspective (pinhole) input images and directly predict a point-based radiance field in which each point is parameterized by a mean position $\mu$, covariance $\Sigma$, opacity $\alpha$, and a set of spherical harmonics color coefficients $c$. These methods are an order of magnitude faster than iterative per-scene optimization pipelines (e.g., ODGS) because they avoid computationally intensive refinement.
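As a rough illustration of this parameterization, the sketch below stores one Gaussian and builds its covariance from per-axis scales and a rotation quaternion, following the factorization $\Sigma = R S S^\top R^\top$ used in the original 3DGS formulation. The names `Gaussian`, `covariance`, `quat_to_rotmat`, and `num_sh_coeffs` are hypothetical, and the source does not state how OmniSplat itself parameterizes the covariance, so this is an assumption rather than a description of its network heads.

```python
import math
from dataclasses import dataclass


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (normalizes first)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scale, quat):
    """Assumed 3DGS-style factorization: Sigma = R S S^T R^T with S = diag(scale)."""
    R = quat_to_rotmat(quat)
    # M = R S: column k of R multiplied by scale[k].
    M = [[R[i][k] * scale[k] for k in range(3)] for i in range(3)]
    # Sigma = M M^T, which is symmetric and positive semi-definite by construction.
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def num_sh_coeffs(degree):
    """Number of spherical harmonics coefficients per color channel up to `degree`."""
    return (degree + 1) ** 2


@dataclass
class Gaussian:
    mean: tuple      # (x, y, z) position
    cov: list        # 3x3 covariance matrix
    opacity: float   # in [0, 1]
    sh: list         # sh[channel][coefficient], 3 color channels


g = Gaussian(
    mean=(0.0, 0.0, 1.0),
    cov=covariance((1.0, 2.0, 3.0), (1.0, 0.0, 0.0, 0.0)),
    opacity=0.8,
    sh=[[0.0] * num_sh_coeffs(3) for _ in range(3)],
)
```

Building the covariance from a scale and a rotation keeps it a valid (symmetric, positive semi-definite) matrix regardless of the raw values a network predicts.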
Adapting feed-forward 3DGS architectures to omnidirectional imagery, such as equirectangular or fisheye images, introduces two critical issues:
- Non-uniform Sampling: Equirectangular projections sample high latitudes (near the poles) more densely than the equator, producing feature distortions when processed with standard CNNs.
- Uneven Gaussian Distribution: Predicting 3DGS parameters from these distorted features yields point clouds denser at the poles and sparser at the equator. The result is "stripe" and "dot" artifacts during novel-view synthesis due to over- and under-sampling, respectively.
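The sampling imbalance behind both issues can be quantified with a short sketch. It computes the solid angle covered by a single equirectangular pixel as a function of its row; the function name `row_solid_angle` and the 512×1024 resolution are illustrative assumptions, not values from the source.

```python
import math


def row_solid_angle(row, height, width):
    """Solid angle (steradians) covered by one pixel in `row` of an equirectangular image."""
    # Latitude of the top and bottom edges of the row; row 0 touches the north pole.
    lat_top = math.pi / 2 - math.pi * row / height
    lat_bot = math.pi / 2 - math.pi * (row + 1) / height
    d_lon = 2 * math.pi / width
    # Exact area of a latitude-longitude cell on the unit sphere.
    return d_lon * (math.sin(lat_top) - math.sin(lat_bot))


height, width = 512, 1024
equator = row_solid_angle(height // 2, height, width)
pole = row_solid_angle(0, height, width)
total = sum(row_solid_angle(r, height, width) * width for r in range(height))

print(f"equator pixel / pole pixel solid angle: {equator / pole:.1f}")
print(f"total solid angle: {total:.6f} (4*pi = {4 * math.pi:.6f})")
```

At this resolution a pixel in the top row covers over 300 times less of the sphere than a pixel at the equator, so a network that predicts one Gaussian per pixel places far more Gaussians per unit of solid angle near the poles.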
2. Yin–Yang Grid Decomposition
To mitigate the domain shift between perspective and omnidirectional images, OmniSplat employs the “Yin–Yang” overset grid decomposition. The sphere is partitioned into two overlapping, quasi-uniform patches: the Yin patch and the Yang patch.
2.1 Mathematical Formulation
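Let $(\theta, \phi)$ define latitude and longitude on the sphere ($\theta \in [-\pi/2, \pi/2]$, $\phi \in [-\pi, \pi]$).
- Yin Patch: $\theta \in [-\pi/4, \pi/4]$, $\phi \in [-3\pi/4, 3\pi/4]$
- Yang Patch: The remainder of the sphere, obtained from the Yin patch by a fixed rotation $R$ about the X-axis.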
Each patch is mapped onto a square grid via orthographic reparameterization, minimizing distortion and producing a near-uniform sampling density analogous to perspective images.
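The coverage property of the two patches can be checked numerically. The sketch below assumes the Yin bounds of the standard Yin–Yang grid (latitude within ±π/4, longitude within ±3π/4) and uses the standard Yin–Yang coordinate transform (x, y, z) → (−x, z, y) from the overset-grid literature as a stand-in for the patch rotation; the exact rotation used by OmniSplat may differ. All function names are hypothetical.

```python
import math
import random

LAT_MAX = math.pi / 4        # assumed latitude half-extent of a patch
LON_MAX = 3 * math.pi / 4    # assumed longitude half-extent of a patch


def to_angles(v):
    """Unit vector -> (latitude, longitude)."""
    x, y, z = v
    return math.asin(max(-1.0, min(1.0, z))), math.atan2(y, x)


def yin_to_yang(v):
    """Assumed standard Yin-Yang transform; it is its own inverse."""
    x, y, z = v
    return (-x, z, y)


def in_patch(v):
    """True if the direction lies inside the patch bounds in the patch's own frame."""
    lat, lon = to_angles(v)
    return abs(lat) <= LAT_MAX and abs(lon) <= LON_MAX


def covered_by(v):
    """Return (in_yin, in_yang) for a unit direction given in the global frame."""
    return in_patch(v), in_patch(yin_to_yang(v))


def random_unit_vector(rng):
    """Uniformly distributed direction on the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-9:
            return tuple(c / n for c in v)


rng = random.Random(0)
samples = [covered_by(random_unit_vector(rng)) for _ in range(10000)]
uncovered = sum(1 for yin, yang in samples if not (yin or yang))
overlap = sum(1 for yin, yang in samples if yin and yang)
print(f"uncovered: {uncovered}, in both patches: {overlap}")
```

Under these assumptions every sampled direction falls in at least one patch, and a small fraction falls in both, which is the overlap region of the overset grid.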
2.2 Pixel-to-Sphere Transformation
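- For a Yin patch pixel $(u, v)$: the pixel coordinates are first converted to latitude and longitude $(\theta, \phi)$ within the patch bounds, and the angles are then converted to a unit direction vector on the sphere.
- For the Yang patch, the rotation about the X-axis from Section 2.1 is first applied to the unit direction vector of the pixel, and the resulting vector is mapped back to latitude and longitude $(\theta, \phi)$ for grid sampling.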
This decomposition enables the direct application of unmodified perspective-based CNN architectures to omnidirectional content.
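A minimal sketch of how such a lookup could be computed is given below. It assumes patch pixels are spaced linearly in angle over latitude ±π/4 and longitude ±3π/4, and it uses the standard Yin–Yang transform (x, y, z) → (−x, z, y) for the Yang patch; both are assumptions, as are all function names, and the mapping actually used by OmniSplat may differ. The output is a continuous pixel location in the source equirectangular image, suitable for bilinear sampling.

```python
import math

LAT_MAX = math.pi / 4        # assumed latitude half-extent of a patch
LON_MAX = 3 * math.pi / 4    # assumed longitude half-extent of a patch


def patch_pixel_to_angles(u, v, width, height):
    """Pixel centre (u, v) of a patch -> (lat, lon), assuming linear spacing in angle."""
    lon = ((u + 0.5) / width * 2 - 1) * LON_MAX
    lat = (1 - (v + 0.5) / height * 2) * LAT_MAX
    return lat, lon


def angles_to_vector(lat, lon):
    """(lat, lon) -> unit direction vector."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def vector_to_angles(vec):
    """Unit direction vector -> (lat, lon)."""
    x, y, z = vec
    return math.asin(max(-1.0, min(1.0, z))), math.atan2(y, x)


def yang_to_global(vec):
    """Assumed standard Yin-Yang transform (its own inverse)."""
    x, y, z = vec
    return (-x, z, y)


def equirect_coords(lat, lon, eq_width, eq_height):
    """(lat, lon) -> continuous (col, row) in an equirectangular image."""
    col = (lon + math.pi) / (2 * math.pi) * eq_width - 0.5
    row = (math.pi / 2 - lat) / math.pi * eq_height - 0.5
    return col, row


def sample_location(u, v, width, height, eq_width, eq_height, patch="yin"):
    """Where in the equirectangular image a patch pixel should be sampled."""
    lat, lon = patch_pixel_to_angles(u, v, width, height)
    vec = angles_to_vector(lat, lon)
    if patch == "yang":
        vec = yang_to_global(vec)  # move the direction into the global frame
    return equirect_coords(*vector_to_angles(vec), eq_width, eq_height)


# Centre pixel of a 3x1 patch, sampled from an 8x4 equirectangular image.
yin_center = sample_location(1, 0, 3, 1, 8, 4, patch="yin")
yang_center = sample_location(1, 0, 3, 1, 8, 4, patch="yang")
```

In this sketch the Yin centre lands in the middle of the equirectangular image, while the Yang centre lands on the wrap-around seam at longitude π, which is the region the Yin patch leaves uncovered.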
3. Network Architecture and Processing Pipeline
OmniSplat’s workflow is structured as follows:
- Image Decomposition: Two omnid