Triple Gaussian Splatting: Real-Time Relighting
- Triple Gaussian Splatting is a computational framework that unifies geometry estimation, reflectance modeling, shadow computation, and global illumination into a single differentiable pipeline.
- It uses anisotropic spatial Gaussians with learned view- and illumination-dependent reflectance to achieve real-time novel-view synthesis at 90 fps on commodity GPUs.
- The method trains roughly an order of magnitude faster than NeRF-based relighting approaches while supporting complex effects such as anisotropic specularities and self-shadowing for photorealistic relighting.
Triple Gaussian Splatting (GS³; pronounced "GS cubed") is a computational framework for real-time, physically based relighting and novel-view synthesis of objects from multi-view One-Light-At-a-Time (OLAT) image sets. GS³ represents a scene via a cloud of anisotropic, spatially situated Gaussians, each equipped with a learned, view- and illumination-dependent reflectance model combining a Lambertian term with a mixture of angular Gaussians (“angular splatting”). The method unifies geometry estimation, direct and indirect reflectance, shadow computation, and global-illumination effects within a fully differentiable, deferred-shading pipeline. GS³ achieves an order-of-magnitude speedup over neural inverse-rendering approaches while rendering complex view- and light-dependent effects such as anisotropic specularities, translucency, and self-shadows at 90 frames per second on commodity GPUs (Bi et al., 2024).
1. Problem Formulation and Motivation
GS³ targets the task of generating photorealistic images of an object under arbitrary views and point-light configurations. Given 500–2,000 multi-view OLAT photographs with known camera and point-light poses, the problem is to learn a scene representation supporting real-time (≈90 fps) photorealistic relighting, including direct illumination, view-dependent effects, self-shadowing, and soft indirect light.
Prior representations exhibit critical limitations: “vanilla” 3D Gaussian Splatting encodes static environment lighting (often via spherical harmonics) and fails under novel light directions or strong view/light-dependent effects. Mesh- and point-based relighting methods may require costly ray tracing or precomputed visibility, and are brittle for translucent or anisotropic geometry. Neural fields (NeRF derivatives) provide high fidelity but require tens of hours of training and render slowly (<1 fps). GS³ addresses these issues by simultaneously optimizing geometric, reflectance, shadow, and indirect-light parameters end-to-end via a splatting-based renderer, lowering both training (40–70 minutes) and inference budgets (Bi et al., 2024).
2. Scene Representation and Reflectance Modeling
A GS³ scene comprises anisotropic 3D “spatial Gaussians.” Each spatial Gaussian consists of:
- Position $p \in \mathbb{R}^3$
- Covariance $\Sigma = R S S^\top R^\top$ (scaling $S$ and rotation $R$)
- Opacity $\alpha$
- Learned reflectance $f(\omega_o, \omega_i)$, with outgoing direction $\omega_o$ and incident light direction $\omega_i$ in the local shading frame
The spatial density at point $x$ is:
$$G(x) = \exp\!\left(-\tfrac{1}{2}(x - p)^\top \Sigma^{-1} (x - p)\right)$$
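As a concrete check, the density can be evaluated directly; the sketch below assumes the standard 3DGS covariance factorization $\Sigma = R S S^\top R^\top$, and all names (`gaussian_density`, `R`, `S`) are illustrative:

```python
import numpy as np

def gaussian_density(x, p, Sigma):
    """Unnormalized spatial density G(x) = exp(-1/2 (x - p)^T Sigma^{-1} (x - p))."""
    d = x - p
    return float(np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)))

# Covariance factored as in 3DGS: Sigma = R S S^T R^T (rotation R, scaling S).
R = np.eye(3)                 # identity rotation for this example
S = np.diag([0.5, 0.1, 0.1])  # an elongated, anisotropic Gaussian
Sigma = R @ S @ S.T @ R.T
p = np.zeros(3)               # Gaussian center
```

The density is 1 at the center and decays anisotropically, fastest along the tightly scaled axes.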
Reflectance per Gaussian is split into a diffuse and a specular term:
$$f(\omega_o, \omega_i) = f_d(\omega_i) + f_s(\omega_o, \omega_i),$$
where $\rho_d$ and $\rho_s$ are the RGB diffuse and specular albedos, and $n$ is the learned shading normal.
- Diffuse: modified Lambertian, $f_d(\omega_i) = \frac{\rho_d}{\pi}\max(n \cdot \omega_i,\, 0)$
- Specular: mixture of shared “angular Gaussians” (anisotropic SGs) applied to the half-vector $h = \frac{\omega_i + \omega_o}{\lVert \omega_i + \omega_o \rVert}$: $f_s(\omega_o, \omega_i) = \rho_s \sum_j c_j\, G_j(h)$
Each basis angular Gaussian $G_j$ is parameterized by an orthonormal frame $(x_j, y_j, z_j)$ and widths $(\lambda_{x,j}, \lambda_{y,j})$:
$$G_j(h) = \max(h \cdot z_j,\, 0)\, \exp\!\left(-\lambda_{x,j}(h \cdot x_j)^2 - \lambda_{y,j}(h \cdot y_j)^2\right)$$
Only the mixture weights $c_j$ are learned per Gaussian; the angular basis is shared scene-wide.
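A minimal sketch of this per-Gaussian reflectance, using a standard anisotropic-spherical-Gaussian lobe form (the paper's exact parameterization may differ) and scalar albedos for simplicity; all function names are illustrative:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def angular_gaussian(h, frame, lam_x, lam_y):
    """One anisotropic spherical-Gaussian lobe evaluated at half-vector h.
    frame = (x, y, z) is an orthonormal frame; lam_x, lam_y control the widths."""
    x, y, z = frame
    return max(h @ z, 0.0) * np.exp(-lam_x * (h @ x) ** 2 - lam_y * (h @ y) ** 2)

def reflectance(wo, wi, n, rho_d, rho_s, weights, basis):
    """Per-Gaussian reflectance: modified Lambertian diffuse plus a weighted
    mixture of shared angular Gaussians on the half-vector."""
    h = normalize(wi + wo)                         # half vector between light and view
    diffuse = rho_d / np.pi * max(n @ wi, 0.0)     # modified Lambertian term
    specular = rho_s * sum(c * angular_gaussian(h, f, lx, ly)
                           for c, (f, lx, ly) in zip(weights, basis))
    return diffuse + specular
```

With view and light both aligned to the normal, the half vector hits the lobe center, so a normal-aligned basis lobe adds its full weight on top of the diffuse term.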
3. Triple Splatting Pipeline
GS³ employs a deferred shading strategy with three sequential “splatting” passes per frame:
1. Appearance (Shading) Splatting
- Each spatial Gaussian is projected to the screen as a 2D ellipse; per-pixel color is accumulated front-to-back as
$$C = \sum_i c_i\,\alpha'_i \prod_{j<i}\left(1 - \alpha'_j\right),$$
with $\alpha'_i$ the opacity of Gaussian $i$ modulated by its projected spatial density at the pixel.
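The accumulation rule is ordinary front-to-back alpha compositing, which can be sketched per pixel as (names illustrative):

```python
def composite_pixel(colors, alphas):
    """Front-to-back alpha compositing of the Gaussians overlapping one pixel,
    given in depth order: C = sum_i c_i * a_i * prod_{j<i} (1 - a_j)."""
    color, transmittance = 0.0, 1.0
    for c, a in zip(colors, alphas):
        color += transmittance * a * c   # contribution attenuated by what is in front
        transmittance *= 1.0 - a         # remaining transmittance after this Gaussian
    return color
```

An opaque Gaussian near the front (alpha near 1) drives the transmittance toward zero, hiding everything splatted behind it.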
2. Shadow Splatting and MLP Refinement
- Each Gaussian is projected in light space to generate a “shadow map.” Opacities along each shadow ray are accumulated, yielding raw per-Gaussian visibilities $v_i$.
- Each $v_i$, concatenated with a learned per-Gaussian latent, is processed by a 3-layer, 32-unit-per-layer MLP (leaky-ReLU activations, sigmoid output), which outputs a refined shadow value $s_i$.
- The resulting $s_i$ are splatted onto the image grid as the shadow mask.
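The shadow-refinement network is a small feed-forward MLP. The forward-pass sketch below uses randomly initialized (untrained, hypothetical) weights and an assumed input size for the raw visibility plus latent:

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 3-layer MLP, 32 units per hidden layer; the input width (raw visibility
# concatenated with a per-Gaussian latent) is an assumption for illustration.
dims = [9, 32, 32, 1]
Ws = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]

def refine_shadow(x):
    """Leaky-ReLU hidden layers, sigmoid output: refined shadow values in (0, 1)."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        x = leaky_relu(x @ W + b)
    return sigmoid(x @ Ws[-1] + bs[-1])
```

In GS³ these weights are trained jointly with the Gaussians; the sigmoid keeps each refined shadow value a valid visibility.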
3. Global Illumination Compensation MLP
- Each spatial Gaussian outputs a residual color $\Delta c_i$ via a 3-layer, 128-unit MLP (leaky ReLU + sigmoid).
- All $\Delta c_i$ are splatted to produce a global-illumination correction image $I_{\text{GI}}$.
The final image is composed as
$$I = I_{\text{shade}} \odot I_{\text{shadow}} + I_{\text{GI}},$$
i.e., the splatted direct shading modulated by the shadow mask, plus the indirect correction.
4. Training Protocols and Implementation
- Loss function: blended $L_1$ and D-SSIM,
$$\mathcal{L} = (1 - \lambda)\,\mathcal{L}_1 + \lambda\,\mathcal{L}_{\text{D-SSIM}},$$
with blending weight $\lambda$.
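A sketch of the blended loss, substituting a simplified global (non-windowed) SSIM for the usual windowed D-SSIM, and assuming the common 3DGS weight λ = 0.2 (the paper's value is not given here):

```python
import numpy as np

def ssim_global(a, b, c1=0.01**2, c2=0.03**2):
    """Simplified global SSIM between two images in [0, 1] (no sliding window)."""
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a**2 + mu_b**2 + c1) * (va + vb + c2))

def loss(pred, gt, lam=0.2):
    """(1 - lam) * L1 + lam * D-SSIM, with D-SSIM = (1 - SSIM) / 2."""
    l1 = np.abs(pred - gt).mean()
    dssim = (1.0 - ssim_global(pred, gt)) / 2.0
    return (1.0 - lam) * l1 + lam * dssim
```

The loss vanishes for a perfect prediction and penalizes both per-pixel error (L1) and structural mismatch (D-SSIM).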
- Initialization and optimization:
- Geometry & opacities as in static GS.
- Angular Gaussians: shared basis (frames and widths) initialized before optimization.
- Two-stage schedule: Stage 1 (15k iters) uses only diffuse reflectance, stabilizing normals; Stage 2 (100k iters) enables the full model (specular, shadows, residuals).
- Adam optimizer; learning rates are set per parameter group, with the angular-Gaussian rates decayed by late training.
- Datasets: NeRF-rendered, OpenSVBRDF, learned-scan, handheld-flash photographs, professional lightstage.
- Compute and resources: 120k–750k spatial Gaussians plus a shared angular basis; 40–70 min training on an RTX 4090. Inference at 90 fps (512×512), with memory use only modestly above static GS.
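The two-stage protocol above can be written down as a simple schedule; the stage names and parameter groups below are illustrative, not from the paper:

```python
# Hypothetical sketch of the two-stage optimization schedule.
STAGES = [
    {"name": "diffuse_warmup", "iters": 15_000,
     "optimize": ["positions", "covariances", "opacities", "diffuse", "normals"]},
    {"name": "full_model", "iters": 100_000,
     "optimize": ["positions", "covariances", "opacities", "diffuse", "normals",
                  "specular_weights", "shadow_mlp", "gi_mlp"]},
]

def total_iterations(stages):
    """Total optimizer steps across all stages."""
    return sum(stage["iters"] for stage in stages)
```

Freezing the specular, shadow, and residual components during warmup lets the normals settle before view- and light-dependent terms are enabled.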
5. Quantitative and Qualitative Evaluation
A summary of quantitative results across representative methods on standard relighting metrics (averaged over test views and lights):
| Method | PSNR (↑) | SSIM (↑) | LPIPS (↓) | Runtime |
|---|---|---|---|---|
| GS³ (Bi et al., 2024) | 34.2 | 0.93 | 0.07 | 90 fps |
| NRHints [Zeng et al. ’23] | 29.9 | 0.92 | 0.09 | <1 fps |
| NRTF [Lyu et al. ’22] | 30.4 | 0.96 | 0.04 | 0.3 fps |
| OSF [Yu et al. ’23] | 26.1 | 0.94 | 0.05 | ~1 fps |
| GaussianShader [Jiang ’23] | 29.3 | 0.94 | 0.06 | 60 fps |
| GS-IR [Liang ’23] | 29.1 | 0.93 | 0.08 | 60 fps |
| Relightable3DGaussian [Gao ’23] | 30.2 | 0.95 | 0.05 | 60 fps |
| TensoIR [Jin ’23] | 31.7 | 0.96 | 0.05 | 60 fps |
Qualitatively, GS³ reproduces intricate relighting phenomena:
- Furballs and subsurface-scattering cups display convincing self-shadows and translucency
- Strong, highly anisotropic highlights on metallic and textile surfaces, attributed to the angular Gaussian mixture
- Accurate self-shadowing in highly occluded scenes (e.g., LEGO assemblies)
6. Discussion, Ablations, and Limitations
- Angular Gaussian basis size: a small shared basis suffices for moderate specular lobes, while a larger one is required to model sharp glints; beyond that, gains diminish.
- Shadow MLP width: reducing hidden units from 32 to 16 noticeably increases shadow noise; removing the MLP entirely leads to visible blockiness and aliasing in shadows.
- Global illumination MLP: disabling it increases the average residual error and leaves soft indirect components unmodelled.
Strengths:
- Integrates geometry, reflectance, shadowing, and indirect lighting into a single, differentiable, end-to-end optimized pipeline
- Achieves real-time rendering (90 fps) at quality on par with or superior to NeRF-based relighting, which remains orders of magnitude slower
- Handles challenging cases (translucent, anisotropic, furry) without per-object prior assumptions
Limitations and Future Directions:
- Does not support explicit modeling of fully transparent, refractive materials; a suggested extension is a differentiable ray-caster replacing the residual MLP
- The fidelity of shadow edges is constrained by the Gaussian cloud’s resolution; further density control or multiscale splatting modes are possible remedies
- Acquisition burden could be reduced via learned illumination multiplexing strategies
GS³’s hybrid approach—combining flexible per-Gaussian reflectance functions, deferred appearance/shadow/global illumination splatting, and learned MLP corrections—enables photorealistic relighting and novel view synthesis at unprecedented interactive speeds, with broad applicability across digitally scanned and real-world captured objects (Bi et al., 2024).