Extreme Compression of 3D Gaussian Splatting

Updated 26 February 2026

3D Gaussian Splatting represents scenes via millions of anisotropic Gaussians encoding geometric, photometric, and semantic details.
Extreme compression techniques reduce model sizes by one to three orders of magnitude while maintaining high rendering fidelity and real-time performance.
Advanced methods leverage hierarchical, plane-based, and diffusion models to optimize rate-distortion trade-offs for immersive media and autonomous applications.

3D Gaussian Splatting (3DGS) is a high-fidelity, efficient scene representation wherein a scene is parameterized by a set of 3D anisotropic Gaussians, each with geometric, photometric, and sometimes semantic attributes. This paradigm is central to real-time neural rendering, immersive media, and autonomous driving applications, but its practical deployment is hampered by the massive storage required for millions of Gaussians per scene. Extreme compression methods accordingly seek to reduce 3DGS model sizes by one to three orders of magnitude, maintaining rendering fidelity and real-time performance while minimizing transmission and memory costs.

1. 3DGS Structure and Compression Bottlenecks

A 3DGS scene comprises $N$ Gaussians $G_i$, each characterized by:

Mean $\mu_i \in \mathbb{R}^3$ (position)
Covariance $\Sigma_i = R_i S_i S_i^\top R_i^\top \in \mathbb{R}^{3 \times 3}$, with rotation $R_i \in SO(3)$ and per-axis scales $S_i = \text{diag}(s_{i,1}, s_{i,2}, s_{i,3})$
RGB color $c_i \in \mathbb{R}^3$ (view-dependent when spherical harmonics are used) and opacity $\alpha_i \in [0,1]$
Additional attributes may include spherical harmonic coefficients or material descriptors.
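The covariance factorization above can be checked numerically. The following is a minimal sketch in plain Python, assuming the rotation is stored as a quaternion in (w, x, y, z) order, which is a common convention but is not stated in the text; the function names are illustrative.

```python
import math

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)  # normalise defensively
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(q, s):
    """Build Sigma = R S S^T R^T with S = diag(s)."""
    R = quat_to_rotmat(q)
    # M = R S: column k of R is scaled by s[k].
    M = [[R[i][k] * s[k] for k in range(3)] for i in range(3)]
    # Sigma = M M^T, symmetric positive semi-definite by construction.
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Axis-aligned Gaussian: Sigma is diag(s_1^2, s_2^2, s_3^2).
sigma_axis = covariance((1.0, 0.0, 0.0, 0.0), (0.5, 1.0, 2.0))

# 90-degree rotation about z swaps the x and y variances.
h = math.sqrt(0.5)
sigma_rot = covariance((h, 0.0, 0.0, h), (0.5, 1.0, 2.0))
```

Storing a quaternion and three scales (7 values) instead of the 6 unique entries of $\Sigma_i$ keeps the matrix valid under optimization and quantization, which is one reason compression schemes usually operate on these factors rather than on $\Sigma_i$ directly.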

Rendering is performed by projecting each 3D Gaussian onto the image plane (“splatting”) and compositing using $\alpha$-blending.
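The compositing step can be illustrated for a single pixel. This is a sketch under the assumption that the splats covering the pixel are already depth-sorted front to back and that each `alpha` is the effective per-pixel value (opacity multiplied by the projected Gaussian falloff); the early-termination threshold is an illustrative choice.

```python
def composite(colors, alphas, min_transmittance=1e-4):
    """Front-to-back alpha blending of depth-sorted splat contributions."""
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for color, alpha in zip(colors, alphas):
        weight = transmittance * alpha
        out = [o + weight * c for o, c in zip(out, color)]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # remaining splats are effectively occluded
    return out, transmittance

# Two half-transparent splats: red in front, green behind.
pixel, remaining = composite([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0.5, 0.5])
```

Because each splat's weight depends on every splat in front of it, errors introduced by quantizing opacity or pruning Gaussians propagate through the whole blending chain.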

The chief compression challenge is the large bitstream overhead imposed by explicit storage of high-precision means, covariances, and attributes for millions of Gaussians. This issue is exacerbated by the unique global sparsity and local density of Gaussian clouds compared to conventional mesh or LiDAR point clouds, and by the heterogeneity of the attribute distributions.
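To make the storage cost concrete, the sketch below counts raw bytes per Gaussian. It assumes the float32 layout of the widely used reference implementation with degree-3 spherical harmonics (3 position, 3 scale, 4 rotation, 1 opacity, and 48 color coefficients); other implementations may store a different set of attributes.

```python
# Assumed per-Gaussian float32 attributes (reference-style layout, SH degree 3).
FLOATS_PER_GAUSSIAN = {
    "position": 3,
    "scale": 3,
    "rotation_quaternion": 4,
    "opacity": 1,
    "sh_coefficients": 48,  # 16 coefficients per color channel
}

def raw_size_bytes(num_gaussians, bytes_per_float=4):
    """Uncompressed size of a scene stored as plain float arrays."""
    return num_gaussians * sum(FLOATS_PER_GAUSSIAN.values()) * bytes_per_float

def compressed_size_mb(num_gaussians, ratio):
    """Size in MB (10^6 bytes) after applying a given compression ratio."""
    return raw_size_bytes(num_gaussians) / ratio / 1e6

raw_mb = raw_size_bytes(1_000_000) / 1e6       # one million Gaussians
at_100x = compressed_size_mb(1_000_000, 100)   # two orders of magnitude
```

Under this layout, spherical harmonics account for 48 of the 59 values, which is why attribute compression tends to dominate the achievable savings.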

2. Point Cloud Geometry Compression Approaches

Direct adoption of standard point cloud codecs (e.g., MPEG G-PCC) yields suboptimal compression, as these tools are not tailored to the fractal, mixed-sparsity spatial structure of Gaussian point clouds. To address this, AI-based geometry compression methods such as GausPcgc introduce learned occupancy-code frameworks, hierarchical voxelization, and data-driven context modeling (Wang et al., 21 May 2025).

Key elements include:

Multi-scale octree occupancy prediction using sparse 3D convolutions and non-uniform bit grouping across multiple hierarchical levels.
A dedicated dataset (GausPcc-1K) exhibiting the relevant local density/fractal dimension statistics for training.
Lossless compression (with respect to the quantization grid) of $\mu_i$ at state-of-the-art bits per point (bpp): 13.27 bpp versus 14.46 bpp for G-PCC, an 8.2% reduction, with faster encoding and decoding.

These geometry-only compressors are plug-in replacements for existing pipelines and preserve rendering quality at the chosen quantization grid. Geometry bits can be reduced by 10–20% without changing the attribute compression, which remains an open extension (e.g., Gaussian attribute compression, GausPcac).
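The occupancy codes that such geometry coders predict can be illustrated with a plain octree. The sketch below quantizes positions to a voxel grid and emits one 8-bit occupancy code per occupied node, breadth-first; it omits the learned context model and entropy coder, and the function names and bit ordering are illustrative assumptions rather than the GausPcgc or G-PCC format.

```python
def quantize_positions(points, depth, lo, hi):
    """Map coordinates in [lo, hi] to unique voxel indices in [0, 2**depth - 1]."""
    n = 1 << depth
    scale = n / (hi - lo)
    voxels = {tuple(min(n - 1, max(0, int((c - lo) * scale))) for c in p)
              for p in points}
    return sorted(voxels)

def octree_occupancy(voxels, depth):
    """Breadth-first list of 8-bit child-occupancy codes, one per occupied node."""
    codes = []
    for level in range(depth):
        shift = depth - level - 1
        children = {}
        for x, y, z in voxels:
            cx, cy, cz = x >> shift, y >> shift, z >> shift
            parent = (cx >> 1, cy >> 1, cz >> 1)
            bit = ((cx & 1) << 2) | ((cy & 1) << 1) | (cz & 1)
            children[parent] = children.get(parent, 0) | (1 << bit)
        codes.extend(children[p] for p in sorted(children))
    return codes

def relative_saving(baseline_bpp, new_bpp):
    """Fractional bitrate reduction of new_bpp relative to baseline_bpp."""
    return (baseline_bpp - new_bpp) / baseline_bpp

voxels = quantize_positions([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], depth=2, lo=0.0, hi=1.0)
codes = octree_occupancy(voxels, depth=2)
raw_bpp = 8 * len(codes) / len(voxels)      # before any entropy coding
saving = relative_saving(14.46, 13.27)      # figures quoted above
```

A learned coder improves on `raw_bpp` by predicting each occupancy code from already decoded neighbours, so that likely codes cost fewer bits under arithmetic coding.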

3. Attribute and Full-Pipeline Compression

Extreme 3DGS compression requires aggressive reduction of redundancy in both geometry and attribute streams. This includes anchor-based, vector-quantized, codebook-based, and hybrid predictive architectures. Notable schemes are listed below.

Method	Compression principle	Rate reduction	PSNR drop
HAC++	Hash-grid context, adaptive quantization, intra-anchor context, adaptive masking (Chen et al., 21 Jan 2025)	>100× (vs. vanilla 3DGS), >20× (vs. Scaffold-GS)	negligible/improved
CompGS++	Spatial/temporal anchor-coupled prediction, learned entropy model, scalar quantization, motion-residual grids, rate-constrained optimization (Liu et al., 17 Apr 2025)	Up to 370× (static), >100× (dynamic)	≤0.1 dB
Smol-GS	Huffman-coded occupancy octree for positions $\mu_i$, learned/quantized splat-wise features decoded by small MLPs (Wang et al., 30 Nov 2025)	4–6 MB models; ≈100× reduction	≤0.1 dB
Feed-forward codecs (FCGS, LocoMoco)	Rapid, scene-agnostic compression, multi-path entropy modules, long-range context, Morton sorting (Chen et al., 2024, Liu et al., 30 Nov 2025)	>20×	<1 dB
Attribute pruning + mixed precision (FlexGaussian)	Attribute-discriminative/partial pruning, per-channel quantization, seconds-level adaptation (Tian et al., 9 Jul 2025)	94.9–96.4% reduction	<1 dB
Noise-substituted VQ (NSVQ-GS)	Separate codebooks for attributes, noise substitution for differentiability, full-precision positions (Wang et al., 3 Apr 2025)	~45×	≤0.2 dB
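Morton sorting, listed above for the feed-forward codecs, orders Gaussians along a space-filling curve so that spatial neighbours end up close together in the bitstream. The following is a generic bit-interleaving sketch on quantized integer coordinates; the bit width and axis order are illustrative assumptions, not the layout used by FCGS or LocoMoco.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code

def morton_sort(voxels, bits=10):
    """Return voxel coordinates ordered along the Z-order curve."""
    return sorted(voxels, key=lambda v: morton3(v[0], v[1], v[2], bits))

ordered = morton_sort([(1, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 0)])
```

Once the Gaussians are in this order, neighbouring entries tend to have similar attributes, which makes context models and residual prediction more effective.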

Lossy-to-lossless attribute compression may combine adaptive quantization (HEMGS), context/hyperprior/auto-regressive entropy modeling, and channel-/instance-adaptive prediction to balance rate and distortion at any operating point (Liu et al., 2024). The strongest hybrid approaches currently exceed 100× compression with marginal or no drop in rendering PSNR, and in some cases improved LPIPS/SSIM, on public benchmarks.
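The interaction between quantization step and bitrate can be sketched with uniform scalar quantization and an empirical entropy estimate. This is a simplification: the methods cited above use learned, context-conditioned probability models, whereas the sketch below measures the zeroth-order entropy of the symbols, and the sample values are made up for illustration.

```python
import math
from collections import Counter

def quantize(values, step):
    """Uniform scalar quantization to integer symbols."""
    return [round(v / step) for v in values]

def dequantize(symbols, step):
    """Reconstruct values from integer symbols."""
    return [s * step for s in symbols]

def entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol (a lower bound for a static coder)."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

values = [0.10, 0.26, -0.40, 0.30, 0.02]   # illustrative attribute values
step = 0.25
symbols = quantize(values, step)
recon = dequantize(symbols, step)
max_error = max(abs(v - r) for v, r in zip(values, recon))
rate = entropy_bits(symbols)
```

A larger `step` merges more values into the same symbol, lowering `rate` while raising `max_error`, which is the basic rate-distortion trade-off that adaptive quantization tunes per attribute or per anchor.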

4. Hierarchical, Plane-Based, and Symmetry-Aware Architectures

Recent work leverages hierarchical and geometric priors to expose spatial/structural redundancy at multiple representation levels:

Hierarchical schemes (HGSC): Pruning guided by global and local importance, octree-coded positions $\mu_i$, KD-tree block partitioning, anchor/non-anchor LoD prediction (with RAHT/VQ/residuals) (Huang et al., 2024).
Tri-plane and feature-plane factorization: Representing per-Gaussian attributes via bilinearly sampled 2D planes with per-channel entropy weighting, with planes encoded using standard video codecs (e.g., HEVC) in the frequency domain. This achieves 60–150× compression with limited PSNR loss (Lee et al., 6 Jan 2025, Wang et al., 26 Mar 2025).
Symmetry-aware (SymGS): Detection of reflective symmetries (dominant mirror planes), pruning redundant halves, and joint optimization. This delivers average 108× compression (up to 256× for large-scale scenes) and integrates seamlessly with existing quantization back-ends (Gupta et al., 17 Nov 2025).

Such approaches systematically remove both structured (symmetry, plane-factorizable) and local (similarity, attribute) redundancy, maximizing achievable compression ratios with task- or perception-driven quality preservation.
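Two of the priors listed above reduce to short geometric operations: sampling an attribute from 2D feature planes, and regenerating a pruned half of the scene by reflection. The sketch below shows both under simplifying assumptions: single-channel planes of at least 2×2 entries, positions normalized to [0, 1], concatenation of the three plane samples, and a mirror plane given as a unit normal `n` and offset `d` with `n·x + d = 0`. None of these choices is taken from the cited papers.

```python
def bilinear(plane, u, v):
    """Bilinearly sample an H x W plane (H, W >= 2) at (u, v) in [0, 1]^2."""
    h, w = len(plane), len(plane[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * plane[y0][x0] + fx * plane[y0][x0 + 1]
    bottom = (1 - fx) * plane[y0 + 1][x0] + fx * plane[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom

def triplane_feature(planes, position):
    """Concatenate samples from the xy, xz and yz planes at a 3D position."""
    x, y, z = position
    return [bilinear(planes["xy"], x, y),
            bilinear(planes["xz"], x, z),
            bilinear(planes["yz"], y, z)]

def reflect_point(p, n, d):
    """Mirror point p across the plane n.x + d = 0 (n must be unit length)."""
    dist = sum(pi * ni for pi, ni in zip(p, n)) + d
    return tuple(pi - 2.0 * dist * ni for pi, ni in zip(p, n))

grid = [[0.0, 1.0], [2.0, 3.0]]
feature = triplane_feature({"xy": grid, "xz": grid, "yz": grid}, (0.5, 0.5, 1.0))
mirrored = reflect_point((3.0, 0.0, 0.0), (1.0, 0.0, 0.0), -1.0)  # plane x = 1
```

For a full symmetry-aware decoder, the covariance would also need to be mirrored, as $H \Sigma_i H^\top$ with $H = I - 2 n n^\top$, along with any view-dependent color terms; the sketch covers positions only.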

5. Diffusion-Based and Generative Restoration for Sublinear Rates

At the extreme low-rate regime (down to 0.1 MB/scene, 1000× compression), rendering quality degrades irreversibly under classical pruning or quantization. Data-driven restoration via diffusion priors thus becomes essential:

ExGS (Chen et al., 29 Sep 2025): Combines Universal Gaussian Compression (UGC, aggressive feed-forward pruning and quantization) with GaussPainter, a mask-guided one-step diffusion denoiser. Mask embeddings guide restoration to preserve high-confidence pixels and inpaint missing regions, enabling 107–352× reductions (to ≈3 MB) with PSNR/SSIM/LPIPS outperforming prior feed-forward and generative inpainting baselines.
NiFi (Eteke et al., 4 Feb 2026): Integrates severe 3DGS artifact synthesis (pruning+quantization, entropy coding) with artifact-aware, single-step latent diffusion. A frozen Stable Diffusion 3 U-Net backbone is augmented with LoRA adapters (restoration+prior). KL and perceptual losses with restoration-prior score matching yield state-of-the-art recovery of high-frequency detail at compression rates up to 1,000× (0.1 MB) and LPIPS of 0.18–0.27 (on par with uncompressed models).

Diffusion-guided restoration outperforms classical methods and direct inpainting by combining view- and context-aware mask conditioning with score matching, which is essential to closing the gap between aggressive transmission rates and photorealistic rendering.
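The idea of preserving high-confidence pixels while filling in missing regions can be illustrated with a hard per-pixel blend. This is only a conceptual sketch: the cited methods condition a diffusion model on mask embeddings rather than compositing images, and the use of accumulated opacity as the confidence signal, the threshold, and the function names are all assumptions made for illustration.

```python
def confidence_mask(accumulated_alpha, threshold=0.5):
    """Mark pixels as trusted (1.0) where the degraded render is well covered."""
    return [[1.0 if a >= threshold else 0.0 for a in row]
            for row in accumulated_alpha]

def blend(rendered, generated, mask):
    """Keep rendered pixels where mask is 1, use generated pixels elsewhere."""
    return [[m * r + (1.0 - m) * g for r, g, m in zip(r_row, g_row, m_row)]
            for r_row, g_row, m_row in zip(rendered, generated, mask)]

rendered = [[0.8, 0.0]]    # second pixel lost to aggressive pruning
generated = [[0.5, 0.4]]   # output of some restoration model (placeholder values)
alpha = [[0.9, 0.1]]       # accumulated opacity of the degraded render
mask = confidence_mask(alpha)
restored = blend(rendered, generated, mask)
```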

6. Implementation Complexity and Practical Trade-offs

Compression methods differ significantly in algorithmic and computational complexity:

AI-based geometry coders (GausPcgc) are characterized by their encoding time per million points (s/Mpoint), while classical octree codecs or attention-based geometry models (OctAttention) incur substantially higher decode time (Wang et al., 21 May 2025).
Attribute pruning/codebook and plane-based models (Smol-GS, FlexGaussian) can be applied entirely post hoc or with lightweight retraining, facilitating mobile/edge deployment.
Diffusion- and VAE-based inpainting, while requiring GPU acceleration and moderate VRAM (e.g., 8 GB for ExGS), can operate interactively at 65 ms/frame (Chen et al., 29 Sep 2025, Eteke et al., 4 Feb 2026).
Hierarchical, symmetry, and multi-level LoD models (HGSC, SymGS) add minimal decode-time overhead, with all mirror/LoD/anchor information precomputed and baked into the bitstream.

Extreme compression comes with rate–distortion trade-offs that can be tuned at encode time. For example, a tunable rate-control parameter in Smol-GS or the codebook size in NSVQ-GS can be adjusted to match bitrate or PSNR/LPIPS requirements (Wang et al., 30 Nov 2025, Wang et al., 3 Apr 2025). The empirical Pareto frontier is determined by the combined choice of pruning aggressiveness, quantization depth, plane/channel bit-allocation, and restoration model.
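Given a set of encoder configurations evaluated on a scene, the empirical Pareto frontier and a Lagrangian operating point can be computed as in the sketch below. The (size, PSNR) pairs are made-up placeholders, not results from any cited method, and using negative PSNR as the distortion term is an illustrative simplification.

```python
def pareto_front(points):
    """Return the (size_mb, psnr_db) points not dominated by a smaller-or-equal,
    higher-quality alternative."""
    front = []
    best_psnr = float("-inf")
    # Sort by size ascending, and by quality descending within equal sizes.
    for size, psnr in sorted(points, key=lambda p: (p[0], -p[1])):
        if psnr > best_psnr:
            front.append((size, psnr))
            best_psnr = psnr
    return front

def pick_operating_point(points, lam):
    """Minimise D + lam * R, with D = -PSNR (dB) and R = size (MB)."""
    return min(points, key=lambda p: -p[1] + lam * p[0])

# Hypothetical measurements: (model size in MB, PSNR in dB).
configs = [(1.0, 25.0), (2.0, 27.0), (3.0, 26.5), (5.0, 27.5), (2.0, 26.0)]
front = pareto_front(configs)
rate_sensitive = pick_operating_point(configs, lam=1.0)
quality_sensitive = pick_operating_point(configs, lam=0.1)
```

A larger `lam` penalizes size more heavily and moves the selected configuration toward the low-rate end of the frontier.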

7. Outlook, Extensions, and Open Problems

Ongoing research in extreme 3DGS compression points toward several future directions:

Learned lossy attribute and geometry compression that jointly optimizes rate–distortion across structured and unstructured components.
Hybrid streaming/transmission architectures that exploit on-demand adaptation, progressive/region-of-interest transmission, and semantic-driven fidelity.
Extensions to dynamic and temporal scenes, as in CompGS++ and recent equivariant models, necessitating predictive coding across time and efficient inter-frame redundancy removal.
Enhanced scene understanding: compressed representations (Smol-GS's discrete tokens or plane-codes) can serve as compact substrates for downstream 3D vision, robotics, and planning tasks.
Generative and diffusion-based refinement: direct latent-editing and back-projection onto 3D structures remain open challenges for bridging extreme compression with interactive, editable, or semantic-aware 3D representations (Eteke et al., 4 Feb 2026, Chen et al., 29 Sep 2025).
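The predictive coding across time mentioned for dynamic scenes can be illustrated with closed-loop differential coding of one attribute per Gaussian. This is a generic DPCM sketch, not the CompGS++ pipeline: it assumes a fixed set of Gaussians across frames, a zero predictor for the first frame, and a uniform quantization step.

```python
def encode_sequence(frames, step):
    """Quantize frame-to-frame residuals against the decoder's reconstruction."""
    recon_prev = [0.0] * len(frames[0])
    residuals = []
    for frame in frames:
        r = [round((v - p) / step) for v, p in zip(frame, recon_prev)]
        # Track what the decoder will see, so quantization error does not drift.
        recon_prev = [p + q * step for p, q in zip(recon_prev, r)]
        residuals.append(r)
    return residuals

def decode_sequence(residuals, step):
    """Rebuild every frame by accumulating dequantized residuals."""
    recon = [0.0] * len(residuals[0])
    frames = []
    for r in residuals:
        recon = [p + q * step for p, q in zip(recon, r)]
        frames.append(recon)
    return frames

frames = [[0.0, 1.0], [0.1, 1.0], [0.2, 1.1]]   # illustrative values
step = 0.1
residuals = encode_sequence(frames, step)
decoded = decode_sequence(residuals, step)
```

After the first frame the residuals are small integers clustered around zero, which is what makes them cheap to entropy-code when motion between frames is smooth.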

Currently, hybrid approaches integrating spatial, attribute, and generative priors with arithmetic or context-based coding define the state-of-the-art in balancing size, quality, and computational practicality for extreme compression of 3D Gaussian Splatting (Wang et al., 21 May 2025).