
3D Gaussian Splatting

Updated 25 February 2026
  • 3D Gaussian Splatting is a method that represents 3D scenes using anisotropic, parameterized Gaussian primitives capturing spatial extent, opacity, and view-dependent radiance.
  • It employs differentiable optimization with multi-view loss functions to reconstruct high-fidelity geometry and appearance from posed images.
  • Advanced acceleration and compression techniques enable real-time rendering, efficient GPU integration, and scalable applications in VR and scientific visualization.

3D Gaussian Splatting (3DGS) is a paradigm for explicit, efficient scene representation and differentiable rendering, in which a complex 3D environment is encoded as a sparse set of anisotropic, parameterized 3D Gaussian primitives. Each such primitive acts as a continuous volumetric kernel endowed with spatial, opacity, and appearance parameters, typically including spherical harmonics for view-dependent radiance. 3DGS yields real-time, high-fidelity novel view synthesis, integrates seamlessly into GPU rasterization pipelines, and supports compositional operations such as blending, pruning, and feed-forward inference. The field has recently advanced through methodological developments in rendering, compression, optimization, scalability, and extension to VR, scientific visualization, and specialized domains.

1. Principled Formulation and Rendering Pipeline

In 3DGS, a scene is represented as a set of $N$ anisotropic Gaussians $G_i = (\mu_i \in \mathbb{R}^3,\ \Sigma_i \in \mathbb{R}^{3\times3},\ \alpha_i \in [0,1],\ \mathbf{c}_i(\cdot))$, with $\mu_i$ the center, $\Sigma_i$ the covariance, $\alpha_i$ the opacity, and $\mathbf{c}_i$ a (typically SH-based) view-dependent color function. The density field at $x \in \mathbb{R}^3$ is

$$G_i(x) = \exp\!\big(-\tfrac12 (x-\mu_i)^\top \Sigma_i^{-1}(x-\mu_i)\big)$$

and the per-pixel color in novel view synthesis is constructed by projecting each $G_i$ to the image plane as

$$\widetilde{\mu}_i = \phi_K(W\mu_i), \qquad \widetilde{\Sigma}_i = J W \Sigma_i W^\top J^\top$$

with $W$ the world-to-camera matrix, $K$ the intrinsics, and $J$ the Jacobian of the projection at $W\mu_i$, followed by evaluation of the 2D Gaussian footprint $\widetilde{G}_i$. Ordered alpha blending—using the non-commutative “over” operator—

$$C = \sum_{i=1}^{N} \mathbf{c}_i \alpha_i \prod_{j<i} (1-\alpha_j)$$

is the classic protocol, necessitating either global or per-tile front-to-back depth sorting for correct compositing (Matias et al., 20 Oct 2025, Bao et al., 2024). Accurate rasterization and compositing are implemented via GPU pipelines leveraging instanced quads (point-sprites), tile-based culling, and hardware or compute-based blending.
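The projection and compositing steps above can be sketched in a few lines of NumPy. This is an illustrative reference implementation (function names are not from any cited codebase); real pipelines run these loops tile-parallel on the GPU.

```python
import numpy as np

def project_gaussian(mu, Sigma, W, K):
    """Project one 3D Gaussian (mu, Sigma) to image space.

    W: 3x4 world-to-camera matrix, K: 3x3 intrinsics.
    Returns the 2D mean and 2x2 covariance of the splat footprint.
    """
    # Transform the center into camera coordinates.
    t = W[:, :3] @ mu + W[:, 3]
    x, y, z = t
    fx, fy = K[0, 0], K[1, 1]
    # Perspective projection of the mean.
    mu2d = np.array([fx * x / z + K[0, 2], fy * y / z + K[1, 2]])
    # Jacobian of the projection at t (local affine approximation).
    J = np.array([[fx / z, 0.0, -fx * x / z**2],
                  [0.0, fy / z, -fy * y / z**2]])
    Sigma_cam = W[:, :3] @ Sigma @ W[:, :3].T
    Sigma2d = J @ Sigma_cam @ J.T
    return mu2d, Sigma2d

def composite_pixel(colors, alphas):
    """Front-to-back 'over' compositing for depth-sorted splats."""
    C, T = np.zeros(3), 1.0  # accumulated color, transmittance
    for c, a in zip(colors, alphas):
        C += T * a * c
        T *= 1.0 - a
        if T < 1e-4:  # early termination once the pixel is nearly opaque
            break
    return C
```

Note the early-termination check: once transmittance is nearly zero, later splats cannot affect the pixel. This optimization depends on sorted order and is exactly what sort-free variants give up.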

Explicit support for spherical harmonics-based color, hierarchical splitting, densification, pruning, and regularization (e.g., surface alignment, as-isometric-as-possible losses) enables scene adaptation and quality control across a range of reconstruction regimes (Matias et al., 20 Oct 2025, Qian et al., 2023).

2. Differentiable Optimization and Loss Function Design

3DGS reconstructs geometry and appearance via differentiable loss minimization over multi-view posed images. The core losses are:

  • Photometric image matching (e.g., $L_1$, MSE, or SSIM): direct matching between rendered and reference images
  • Geometry-regularizing terms: depth-normal consistency, multi-view re-projection, and SDF- or mesh-aligned regularization for structure (Wu et al., 2024, Wang et al., 1 Jul 2025)
  • Positional and shape consistency within Gaussians, often promoting local isometry or smoothness (Qian et al., 2023)

Optimization parameters include Gaussian centroids, covariances, opacities, spherical harmonics coefficients, and sometimes surface anchors; backpropagation is performed through the GPU renderer.

Dynamic control mechanisms, such as per-region density adaptation (Wang et al., 1 Jul 2025) and adaptive densification/pruning (Gao et al., 2 Jan 2025), respond to scene complexity and error signals, yielding models that allocate Gaussians efficiently to high-detail areas or prune redundancies without significant perceptual loss (Zhang et al., 2024, Lee et al., 21 Mar 2025). Learning-based initialization accelerates convergence and improves stability in complex or poorly textured scenes (Wang et al., 1 Jul 2025, Gao et al., 2 Jan 2025).
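The loss design and density control above can be sketched as follows; the function names, the global-SSIM stand-in, and the thresholds are illustrative assumptions (production 3DGS uses a windowed SSIM and tuned schedules), though the $\lambda = 0.2$ mixing weight is the common default.

```python
import numpy as np

def photometric_loss(rendered, target, lam=0.2):
    """Combined photometric objective: (1 - lam) * L1 + lam * (1 - SSIM).

    A crude global-SSIM stand-in is used here for brevity; real
    pipelines compute SSIM over local windows.
    """
    l1 = np.abs(rendered - target).mean()
    mu_r, mu_t = rendered.mean(), target.mean()
    var_r, var_t = rendered.var(), target.var()
    cov = ((rendered - mu_r) * (target - mu_t)).mean()
    c1, c2 = 0.01**2, 0.03**2
    ssim = ((2 * mu_r * mu_t + c1) * (2 * cov + c2)) / \
           ((mu_r**2 + mu_t**2 + c1) * (var_r + var_t + c2))
    return (1 - lam) * l1 + lam * (1 - ssim)

def densify_or_prune(grad_norm, opacity, grad_thresh=2e-4, op_thresh=0.005):
    """Per-Gaussian density-control decision driven by view-space gradients."""
    if opacity < op_thresh:
        return "prune"      # nearly transparent: remove
    if grad_norm > grad_thresh:
        return "densify"    # under-reconstructed region: clone or split
    return "keep"
```

The gradient threshold mimics the view-space positional-gradient criterion commonly used to trigger cloning/splitting, and the opacity threshold the standard transparency-based pruning rule.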

3. Rendering Acceleration and Architectural Innovations

Despite the sparsity of the 3DGS representation, rendering of millions of Gaussians per frame motivates a wide array of acceleration strategies:

  • Sort-Free Weighted Sum Rendering (WSR): Replaces order-dependent alpha blending with a commutative weighted sum, removing the need for sorting and supporting a single hardware blending pass. Several weighting schemes (Direct, Exponential, Linear-Correction) trade off physical exactness for practical speed and artifact removal, e.g.,

$$C = \frac{c_B w_B + \sum_i c_i \alpha_i\, w(d_i)}{w_B + \sum_i \alpha_i\, w(d_i)}$$

This yields a $1.23\times$ average speedup on mobile GPUs, with fewer temporal artifacts and lower memory due to reduced Gaussian count (Hou et al., 2024).
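A minimal sketch of weighted-sum compositing, assuming the exponential weighting variant (the helper below is illustrative, not code from Hou et al.):

```python
import numpy as np

def weighted_sum_pixel(colors, alphas, depths, c_bg, w_bg=1.0, s=1.0):
    """Sort-free weighted-sum compositing with exponential depth weighting.

    Order-independent: the sums commute, so no per-tile depth sort is
    needed. `s` controls how strongly nearer splats dominate; the choice
    w(d) = exp(-s * d) is one of several published weight families.
    """
    w = np.exp(-s * np.asarray(depths))
    num = c_bg * w_bg + sum(c * a * wi for c, a, wi in zip(colors, alphas, w))
    den = w_bg + float(np.dot(alphas, w))
    return num / den
```

Because addition commutes, splats can be accumulated in any order, which is what permits a single hardware blending pass without depth sorting.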

  • Tensor Core Acceleration (TC-GS): Maps per-pixel, per-Gaussian exponentiation into matrix-multiplies on TCUs. By reformulating

$$\beta_i^{(j)} = v_0^{(j)} + v_1^{(j)} x_i + v_2^{(j)} y_i + \dots$$

as $B = U^\top V$, massive speedups over kernel-invocation baselines are achieved. Mixed-precision computation is preserved via a global-to-local coordinate transform that prevents FP16 overflow. End-to-end, this allows up to $5.6\times$ acceleration without quality loss (Liao et al., 30 May 2025).
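The polynomial reformulation can be illustrated in NumPy, with a plain matmul standing in for the tensor-core GEMM. The expansion of each 2D Gaussian exponent into pixel monomials conveys the general idea; the exact factorization and precision handling in TC-GS differ.

```python
import numpy as np

def exponents_via_matmul(pixels, means, inv_covs):
    """Batch all per-pixel, per-Gaussian exponents as one matrix multiply.

    Expands -0.5 (p - mu)^T A (p - mu) into a polynomial over the pixel
    monomials [1, x, y, x^2, xy, y^2], so that B = U @ V.T evaluates all
    P x G exponents at once (the shape TC-GS maps onto tensor cores).
    """
    x, y = pixels[:, 0], pixels[:, 1]
    U = np.stack([np.ones_like(x), x, y, x**2, x * y, y**2], axis=1)  # P x 6
    V = []
    for mu, A in zip(means, inv_covs):
        a, b, c = A[0, 0], A[0, 1], A[1, 1]   # symmetric 2x2 inverse cov
        mx, my = mu
        V.append([
            -0.5 * (a * mx * mx + 2 * b * mx * my + c * my * my),  # 1
            a * mx + b * my,                                       # x
            b * mx + c * my,                                       # y
            -0.5 * a,                                              # x^2
            -b,                                                    # x*y
            -0.5 * c,                                              # y^2
        ])
    return U @ np.asarray(V).T  # P x G matrix of exponents
```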

  • Hardware/Software Co-design (Streaming 3DGS): For high-frame-rate, resource-constrained deployment, algorithms such as tile-based viewpoint warping and workload prediction allow sparse updates that leverage frame continuity. Customized accelerators implement culling and mapping logic, improving rasterization-core utilization from $51.5\%$ to $88.6\%$ and yielding up to $17.3\times$ end-to-end speedup on edge devices (Wei et al., 29 Jul 2025).
  • Foveated and Hierarchical Rasterization for VR: Addressing the particular challenges of wide-FOV, low-latency VR, combined methods such as Mini-Splatting (model compaction), StopThePop (stable per-pixel resorting), and Optimal Projection reduce $N$ by $10\times$, eliminate popping/god-ray artifacts, and achieve $>72$ FPS at 2K+ per-eye resolution (Tu et al., 15 May 2025).

4. Compression, Pruning, and Memory-Efficient Representation

Given the linearly scaling memory and bandwidth demands, dedicated compression frameworks have become central in 3DGS:

  • Tri-plane Compression: Gaussian attributes are encoded via features sampled from tri-planes, lending spatial coherence and facilitating KNN-based entropy coding, position-sensitive decoding, and adaptive wavelet loss. This reduces storage by $100\times$ over vanilla 3DGS while matching LPIPS/PSNR (Wang et al., 26 Mar 2025).
  • Minimal Gaussians and Learned Pruning: Aggressive pruning guided by importance and local distinctiveness (and differentiable Gumbel-Sigmoid masking) yields up to $70\%$ reduction in primitives with negligible perceptual loss, doubling rendering speed and strongly reducing memory (Zhang et al., 2024, Lee et al., 21 Mar 2025).
  • Streaming and Hierarchical Models: Multi-scale (Scale-GS) or anchor-based (Scaffold-GS) adaptive frameworks coordinate coarser/finer Gaussians and prioritize dynamic/complex regions via bidirectional masking, gradient-aware spawning, and hybrid deformation, supporting efficient streaming and frame-wise updates (Yang et al., 29 Aug 2025).
  • Quality Evaluation: Systematic IQA benchmarks, such as 3DGS-IEval-15K, uncover that most existing metrics underserve geometry-induced distortions unique to 3DGS; specialized geometry-aware, view-ray-consistent metrics are recommended for future development (Xing et al., 17 Jun 2025).
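The importance-driven pruning above can be sketched with a simple hand-crafted score. The opacity-times-volume proxy and the fixed keep ratio below are illustrative assumptions; learned Gumbel-Sigmoid masks replace such heuristics with a differentiable, rendering-error-aware selection.

```python
import numpy as np

def prune_by_importance(opacities, scales, keep_ratio=0.3):
    """Score-based pruning: keep the top-k most 'important' Gaussians.

    Importance here is a hypothetical opacity * volume proxy computed
    from each Gaussian's three axis scales; published pruners refine
    this with distinctiveness and rendering-error feedback.
    """
    importance = opacities * np.prod(scales, axis=1)
    k = max(1, int(len(opacities) * keep_ratio))
    keep = np.argsort(importance)[-k:]          # indices of top-k scores
    mask = np.zeros(len(opacities), dtype=bool)
    mask[keep] = True
    return mask
```

The boolean mask can then gate the Gaussian set directly, with the surviving primitives optionally fine-tuned to recover any quality lost to pruning.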

5. Applications, Extensions, and Domain-Specific Adaptations

3DGS supports an expanding ecosystem of applications, with explicit attention to domain needs:

  • Surface and Large-Scale Reconstruction: Extensions for aerial/LiDAR-driven mapping implement spatial chunking, chunked training, and ray-Gaussian intersection for depth and normals extraction, demonstrated to match or surpass MVS and open-source benchmarks on urban and aerial datasets (Wu et al., 2024).
  • Animatable Avatars and Dynamic Scenes: Non-rigid deformation MLPs, surface-aligned optimization, and explicit LBS-based pose control enable real-time, photorealistic human avatars, with training times reduced by orders of magnitude relative to NeRF (Qian et al., 2023, Wang et al., 1 Jul 2025). Hierarchical or spatio-temporal Gaussian trajectories address dynamic and 4D capture.
  • Image-Based and Physics-Aware Splatting: Hybrid schemes (IBGS) synthesize high-frequency detail and complex view-dependent effects (e.g., specularities) via residuals aggregated from neighboring source images, achieving SOTA quality at $40$–$60\%$ of the storage cost of prior methods (Nguyen et al., 18 Nov 2025). In physically challenging contexts (e.g., underwater), learnable attenuation/backscatter and uncertainty-aware pruning combine to disentangle media effects and suppress floating artifacts (Xing et al., 8 Aug 2025).
  • Scientific Visualization and Distributed Rendering: By partitioning the scene, distributing training and rendering across nodes and GPUs with ghost region exchange and background masking, 3DGS enables high-resolution HPC-scale visualization with near-linear scaling and artifact-free boundaries (Han et al., 15 Sep 2025).

6. Analysis of Limitations and Trajectories for Future Research

Despite rapid progress, 3DGS continues to face research and engineering challenges:

  • Rendering Artifacts: While sort-free weighted-sum approaches eliminate popping and sorting-induced discontinuities, “early termination” (efficient occlusion culling) is not natively available, posing a challenge for extremely large $N$ (Hou et al., 2024). Depth-related floating or flickering artifacts persist under extreme occlusion unless advanced per-pixel resorting or ray-based methods are adopted (Tu et al., 15 May 2025, Matias et al., 20 Oct 2025).
  • Scalability: Memory and workload trade-offs for interactive, large-scale, or long-horizon streaming require advances in redundancy elimination, anchor/voxel-clustering, and federated region-based learning (Yang et al., 29 Aug 2025, Wei et al., 29 Jul 2025, Bao et al., 2024).
  • Generalization and Extensions: Generative, pose-agnostic, and physics-augmented paradigms—such as BRDF augmentation, relighting, semantic editing, and hybrid neural field integration—remain active targets (Matias et al., 20 Oct 2025, Bao et al., 2024).
  • Quality and Metric Limitations: Current full-reference and deep NR-IQA metrics are suboptimal for geometry-driven artifacts common to 3DGS compression; geometry- or ray-consistency-aware IQA is needed (Xing et al., 17 Jun 2025).
  • Integration and Hardware: Exploration of moment-based or layered weighted sum rendering, adaptive “early-exit” culling, per-pixel hardware termination logic, and standardized hardware-software co-design frameworks will dictate progress in commodity and edge deployment (Hou et al., 2024, Liao et al., 30 May 2025, Wei et al., 29 Jul 2025).

7. Comparative Evaluation and Performance Summary

The empirical performance of modern 3DGS approaches is quantitatively established as follows (selected metrics):

| Model/Class | PSNR (Mip-NeRF360 unless noted) | SSIM | LPIPS | Gaussians (M) | Storage (MB) | Render FPS | Main Features |
|---|---|---|---|---|---|---|---|
| Vanilla 3DGS | 27.21 | 0.815 | 0.214 | ~3.98 | 431–764 | 30–60 | Sorted alpha blend, explicit SH |
| LC-WSR (Sort-Free) | 27.19 | 0.804 | 0.211 | ~2.88 | ~63% of 3DGS | 1.23× 3DGS | Weighted-sum rendering, no popping |
| IBGS (Image-Based) | 28.33 | 0.837 | 0.186 | ~1.59 | 291 | <3DGS | Residuals from source images, fine detail |
| Tri-plane Compression (TC-GS) | 23.96 (Tanks&Temples Train) | 0.843 | 0.115 | — | 7.66 | — | Context-aware tri-plane, entropy coding |
| OMG (Minimal Gaussians) | 27.21 | 0.814 | — | 0.56 | 5.31 | 298–600+ | Distinctness-driven pruning, quantization |
| 3DGS-Avatar | 30.6 (ZJU-MoCap) | 0.977 | 0.020 | 0.05 | — | >50 | Deformable, AIAP reg., LBS, <0.5 h train |

These results demonstrate that advanced pruning, compression, and rendering reformulations can reduce storage and compute costs by 10–50× or more with negligible (or even improved) quality, and enable real-time, high-resolution rendering across diverse domains (Zhang et al., 2024, Wang et al., 26 Mar 2025, Qian et al., 2023, Tu et al., 15 May 2025).


3D Gaussian Splatting represents a confluence of tractable explicit modeling, high-throughput differentiable rendering, and versatile scene parameterization. Continuing methodological advances—spanning rendering accuracy, computational scalability, pruning, streaming, and domain adaptation—underscore its central role in the evolving landscape of real-time computer graphics, novel view synthesis, and beyond. Key research directions involve principled metric development, integration of richer physical/material priors, hierarchical and federated training, and hardware–software co-design for universal deployment (Bao et al., 2024, Hou et al., 2024, Liao et al., 30 May 2025, Xing et al., 17 Jun 2025, Xing et al., 8 Aug 2025).
