NSVQ for 3DGS Compression
- NSVQ is a differentiable quantization framework that compresses 3D Gaussian Splatting scenes by jointly learning discrete attribute codebooks using a noise substitution mechanism.
- It selectively preserves high-precision attributes while compressing others via separate codebooks, achieving up to a 45× reduction in memory with minimal rendering quality loss.
- NSVQ enables efficient gradient flow, integrates seamlessly with standard 3DGS pipelines, and boosts rendering speed for bandwidth- and latency-sensitive applications.
Noise-Substituted Vector Quantization (NSVQ) is a differentiable quantization framework introduced for compressing 3D Gaussian Splatting (3DGS) scene representations. 3DGS relies on millions of “splats” (anisotropic 3D Gaussians) parameterized by high-dimensional float attributes, which results in prohibitive storage requirements—typically around 1 GB per scene. NSVQ addresses this limitation by jointly learning discrete attribute codebooks and attribute assignments while preserving end-to-end differentiability using a noise-injection mechanism. This permits substantial memory reduction with minimal loss in rendering quality and guarantees compatibility with standard 3DGS pipelines (Wang et al., 3 Apr 2025).
1. Model Framework and Attribute Factorization
A standard 3DGS model represents a scene as $N$ splats, each defined by a vector of 59 real-valued attributes:
- $\mu \in \mathbb{R}^3$: 3D position
- $o \in \mathbb{R}$: opacity
- $s \in \mathbb{R}^3$: scaling parameters
- $r \in \mathbb{R}^4$: rotation quaternion (covariance orientation)
- $c \in \mathbb{R}^3$: color
- $h \in \mathbb{R}^{45}$: spherical-harmonic coefficients
NSVQ-GS preserves position $\mu$ and opacity $o$ in full precision, while the attributes $(s, r, c, h)$ are compressed via four separate codebooks. Each codebook discretizes its respective attribute using $2^B$ codes, where $B$ is the bitwidth:
| Attribute | Codebook ($\mathcal{C}$) | Dimensionality ($d$) | Bitwidth ($B$) |
|---|---|---|---|
| scaling $s$ | $\mathcal{C}_s$ | 3 | $B_s$ |
| rotation $r$ | $\mathcal{C}_r$ | 4 | $B_r$ |
| color $c$ | $\mathcal{C}_c$ | 3 | $B_c$ |
| SH coefficients $h$ | $\mathcal{C}_{sh}$ | 45 | $B_{sh}$ |
Each splat stores the code indices $(k_s, k_r, k_c, k_{sh})$, i.e., only $B_s + B_r + B_c + B_{sh}$ bits per splat for these attributes (Wang et al., 3 Apr 2025).
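As a sanity check on the storage accounting above, the per-splat bit count can be sketched in a few lines of Python. The 14-bit default bitwidths (i.e., 16,384-entry codebooks) are illustrative assumptions, not values confirmed by the paper:

```python
# Sanity check on the per-splat storage accounting described above.
FLOAT_BITS = 32

def bits_per_splat(b_s=14, b_r=14, b_c=14, b_sh=14):
    """Bits stored per splat: full-precision position/opacity plus code indices.
    The 14-bit defaults (16,384-entry codebooks) are illustrative assumptions."""
    full_precision = (3 + 1) * FLOAT_BITS   # position (3 floats) + opacity (1 float)
    indices = b_s + b_r + b_c + b_sh        # one code index per compressed attribute
    return full_precision + indices

uncompressed = 59 * FLOAT_BITS              # 59 float32 attributes per splat
print(uncompressed, bits_per_splat())       # 1888 184
```

Under these assumed bitwidths, per-splat storage drops from 1888 bits to 184 bits, roughly a 10× saving before codebook overhead is counted.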
2. Differentiable Quantization via Noise Substitution
Hard vector quantization by $\hat{z} = \arg\min_{e \in \mathcal{C}} \|z - e\|_2$ is non-differentiable due to the discrete assignment. To enable backpropagation, NSVQ replaces the attribute vector by a noisy substitute:

$$\tilde{z} = z + \|z - e\|_2 \, \frac{v}{\|v\|_2}, \qquad v \sim \mathcal{N}(0, I).$$

Here $z$ is the current attribute vector, $e$ is its closest codebook element, and $v$ is a random vector. Both the noise magnitude $\|z - e\|_2$ and the substitute $\tilde{z}$ are differentiable with respect to $z$ and $e$, thus gradients flow from the loss to both the encoder and the codebook entries. This circumvents the need for a straight-through estimator (Wang et al., 3 Apr 2025).
During training, this mechanism is applied independently to each attribute ($s$, $r$, $c$, $h$) via their respective codebooks.
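A minimal NumPy sketch of the substitution step follows (forward pass only; in practice this runs inside an autograd framework such as PyTorch so that gradients actually flow through $z$ and $e$):

```python
import numpy as np

def nsvq_substitute(z, codebook, rng):
    """Noise-substituted VQ: return a surrogate for z whose distance
    to z equals the true quantization error.
    z: (d,) attribute vector; codebook: (K, d) array of codes."""
    # Hard assignment: nearest codebook entry in Euclidean distance.
    d2 = np.sum((codebook - z) ** 2, axis=1)
    k = int(np.argmin(d2))
    e = codebook[k]
    # Replace the quantization residual with random noise of equal norm.
    v = rng.standard_normal(z.shape)
    z_tilde = z + np.linalg.norm(z - e) * v / np.linalg.norm(v)
    return z_tilde, k

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 3))
z = rng.standard_normal(3)
z_tilde, k = nsvq_substitute(z, codebook, rng)
# The substitute sits at exactly the quantization-error distance from z.
print(np.isclose(np.linalg.norm(z_tilde - z),
                 np.linalg.norm(z - codebook[k])))   # True
```

The key property is that the perturbation has the same magnitude as the quantization error, so the training signal statistically mimics hard quantization while remaining differentiable.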
3. Training Objective and Optimization Schedule
The joint optimization combines:
- Reconstruction loss: $\mathcal{L}_{\text{recon}}$ (per-pixel error between rendered and ground-truth images)
- Opacity regularization: $\mathcal{L}_{\text{opacity}} = \sum_i o_i$ (used for pruning low-opacity splats)
- (Optional) VQ commitment loss: $\mathcal{L}_{\text{VQ}}$ (as in VQ-VAE, to encourage codebook utilization)

The combined loss:

$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \lambda\, \mathcal{L}_{\text{opacity}} + \beta\, \mathcal{L}_{\text{VQ}},$$

where $\lambda$ and $\beta$ control regularization during pruning and codebook stabilization, respectively. In fine-tuning, assignments are frozen and $\beta$ is set to zero (Wang et al., 3 Apr 2025).
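The combined objective can be sketched as follows. The weights `lam` and `beta` are placeholder values, and the reconstruction term is shown as plain L1 even though 3DGS pipelines typically mix L1 with a D-SSIM term:

```python
import numpy as np

def total_loss(rendered, target, opacities, z, e, lam=0.01, beta=0.25):
    """Combined NSVQ-GS objective (sketch; lam/beta are illustrative).
    rendered/target: image arrays; opacities: per-splat opacities;
    z/e: an attribute vector and its assigned codebook entry."""
    l_recon = np.abs(rendered - target).mean()   # per-pixel error (L1 here)
    l_opacity = opacities.sum()                  # encourages pruning faint splats
    l_vq = np.sum((z - e) ** 2)                  # commitment toward assigned code
    return l_recon + lam * l_opacity + beta * l_vq
```

During the pruning phase only the opacity term is active; during vector quantization the VQ term takes over; in fine-tuning both extra weights are effectively zero.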
The four-phase training schedule is:
- Warm-up: Full precision rendering and latent optimization
- Pruning: Remove low-opacity splats
- Vector quantization: Train with NSVQ and update codebooks
- Fine-tuning: Freeze quantization, optimize only model parameters
4. Pseudocode for NSVQ Training Loop
The training procedure is as follows:
```
Initialize 3DGS model; initialize codebooks C_s, C_r, C_c, C_sh via K-means.
for iter = 1 to 45_000:
    if iter <= 15_000:                      # Warm-up
        render with full-precision Gaussians
        L ← L_recon
    elif iter <= 20_000:                    # Pruning
        render full-precision
        L ← L_recon + λ_opacity ⋅ sum(o_i)
        prune low-opacity splats
    elif iter <= 43_000:                    # Vector quantization
        for each splat i:
            compute z_s = s_i; nearest code e_s = C_s[k_si]
            tilde_s_i = NSVQ(z_s, e_s)
            # repeat for r, c, sh
        render with quantized attributes
        L ← L_recon + β ⋅ L_VQ
        backpropagate L; update model and codebooks
        every M batches: replace unused codes
    else:                                   # Fine-tuning
        fix code indices; tilde_s_i = e_s
        render, L ← L_recon
        update only model parameters
```
5. Compression Ratio, Reconstruction Fidelity, and Rendering Speed
NSVQ-GS achieves significant storage savings by storing only code indices and codebooks:
- Original memory: $N \times 59 \times 32$ bits ($N$ splats × 59 floats × 32 bits each)
- Compressed: $N \times (128 + B_s + B_r + B_c + B_{sh})$ bits for the splats (128 bits covering the full-precision position and opacity), plus the codebooks themselves
The compression ratio is defined as:

$$\mathrm{CR} = \frac{59 \times 32 \times N}{N\,(128 + B_s + B_r + B_c + B_{sh}) + 32 \sum_{a \in \{s,r,c,sh\}} 2^{B_a} d_a}.$$
For NSVQ-GS(16k) (the name indicating roughly 16,384-entry codebooks), on Mip-NeRF360:
| Model | PSNR | SSIM | LPIPS | Size | Compression Ratio | FPS (rendering) |
|---|---|---|---|---|---|---|
| NSVQ-GS(16k) | 27.28 | 0.807 | 0.239 | 16.4 MB | ≈45× | 103 |
| CompGS(16k) | 27.03 | 0.804 | 0.243 | 18 MB | — | — |
| Baseline 3DGS | — | — | — | 1 GB (734 MB float) | — | 43 |
Rendering throughput approximately doubles after compression, attributed to reduced per-splat data transfer and cache-friendly codebook access (Wang et al., 3 Apr 2025).
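The roughly 45× compression factor can be checked against the table's numbers, taking the 734 MB float baseline against the 16.4 MB compressed size:

```python
# Back-of-envelope check of the compression factor implied by the table:
# 734 MB float baseline vs. 16.4 MB compressed.
baseline_mb = 734
compressed_mb = 16.4
ratio = baseline_mb / compressed_mb
print(round(ratio, 1))   # 44.8
```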
6. Codebook Utilization and Gradient Flow
NSVQ’s differentiable formulation enables gradient flow w.r.t. both attributes and codebook vectors. Only active codes receive gradients; hence, to prevent codebook collapse, rarely used codes are periodically replaced by randomly perturbed copies of active codes during training.
This mechanism negates the need for straight-through estimators and ensures joint optimization stability. In fine-tuning, code assignments become fixed and the noise-injection is removed, yielding deterministic attribute decoding at inference (Wang et al., 3 Apr 2025).
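The periodic replacement of dead codes might look like the following NumPy sketch; the perturbation scale is an assumed hyperparameter, not a value from the paper:

```python
import numpy as np

def refresh_dead_codes(codebook, usage_counts, rng, noise_scale=0.01):
    """Replace codes with zero usage by perturbed copies of active codes,
    preventing codebook collapse (noise_scale is an assumed hyperparameter)."""
    dead = np.flatnonzero(usage_counts == 0)
    active = np.flatnonzero(usage_counts > 0)
    if dead.size == 0 or active.size == 0:
        return codebook
    donors = rng.choice(active, size=dead.size)   # sample active codes to copy
    codebook = codebook.copy()
    codebook[dead] = codebook[donors] + noise_scale * rng.standard_normal(
        (dead.size, codebook.shape[1]))
    return codebook
```

Running this every few hundred batches (the "every M batches" step in the pseudocode above) keeps all codebook entries reachable by gradient updates.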
7. Compatibility, Deployment, and Practical Implications
The final NSVQ-GS model is a standard list of Gaussians with associated codebooks and per-splat code indices. All inference-time operations—codebook lookup, attribute decoding, and α-blending—are compatible with existing 3DGS viewers (CPU or GPU) and do not require auxiliary neural decoders. This design ensures seamless integration with web-based viewers, 3D editors, and SLAM systems. The memory and speed improvements are preconditions for practical deployment in bandwidth- or latency-sensitive environments.
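Inference-time decoding reduces to plain table lookups, which can be sketched as follows (the attribute names and dictionary layout are illustrative, not the paper's API):

```python
import numpy as np

def decode_splat(indices, codebooks, position, opacity):
    """Inference-time decoding: pure table lookups, no neural decoder.
    indices: attribute name -> code index; codebooks: name -> (K, d) array.
    position and opacity are stored in full precision and passed through."""
    attrs = {name: codebooks[name][k] for name, k in indices.items()}
    attrs["position"] = position   # full-precision, never quantized
    attrs["opacity"] = opacity
    return attrs
```

Because decoding is a dictionary of array lookups, any existing rasterizer can consume the result without modification.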
A plausible implication is that NSVQ-GS enables large-scale 3D scene distribution and complex scene rendering with commodity hardware, aligning 3DGS compression performance with industry application requirements (Wang et al., 3 Apr 2025).