
Smol-GS: Compact 3D Gaussian Splatting

Updated 7 December 2025
  • Smol-GS is a compact 3D scene encoding technique that uses Gaussian splats with a lossless occupancy-octree for spatial compression and learned quantized features for appearance.
  • It achieves up to 155× compression over traditional 3DGS methods while maintaining competitive rendering quality, with real-time speeds of 200–400 fps.
  • The method decouples spatial and feature data to support downstream tasks such as semantic labeling, robotic navigation, and efficient scene editing.

Smol-GS is a method for highly compact 3D scene encoding based on the 3D Gaussian Splatting (3DGS) paradigm, achieving state-of-the-art compression ratios while retaining visual fidelity and enabling downstream machine-learning and robotic applications. It combines a lossless spatial hierarchy for Gaussian coordinates with learned, quantized abstract per-splat attributes, providing a memory-efficient and semantically enhanced representation suitable for demanding real-time and mobile scenarios (Wang et al., 30 Nov 2025).

1. Motivation and Problem Setting

3D Gaussian Splatting models a scene as a collection of Gaussian “splats” in $\mathbb{R}^3$, each with associated geometric and appearance parameters. Typical high-quality reconstructions require millions of splats, yielding model sizes of hundreds of megabytes to gigabytes. This precludes efficient streaming, mobile inference, or storage-constrained deployment. Prior approaches that focus solely on attribute quantization or anchor-offset schemes either fail to deliver sufficient storage savings or introduce spatial redundancies. Smol-GS responds to these deficits by:

  • Explicitly compressing splat coordinates using a recursive occupancy-octree
  • Abstracting appearance/material cues per-splat and entropy-coding them
  • Decoupling spatial and feature compression to support editing, sparse access, and downstream analysis

The design specifically targets practical applications such as robotics, web-based visualization, and downstream scene understanding, where model size and semantic manipulability are both critical (Wang et al., 30 Nov 2025).

2. Mathematical Foundations

Each 3D splat $G_i$ is parameterized by:

  • Mean $\boldsymbol{\mu}_i \in \mathbb{R}^3$
  • Covariance $\Sigma_i \in \mathbb{R}^{3\times 3}$
  • Opacity $o_i \in [0,1]$
  • Learned abstract feature vector $f_i \in \mathbb{R}^{n_f}$ (practically $n_f = 8$)

The density formulation is

$$G_i(\mathbf{x}) = \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^\top \Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)\right)$$

Projected 2D Gaussians along camera rays are composited via ordered $\alpha$-blending for view synthesis. Rendering reduces to evaluating the composited sum along each ray, where features $f_i$ are decoded by compact multi-layer perceptrons (MLPs) into color $c_i$, rotation $r_i$, scale $s_i$, and opacity $o_i$ (Wang et al., 30 Nov 2025).
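As a minimal sketch of these two operations, the snippet below evaluates the unnormalized Gaussian density and performs ordered front-to-back alpha-blending along a ray. It assumes NumPy; function names and the sample parameters are illustrative, not from the paper:

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Unnormalized 3D Gaussian density G_i(x) from the formula above."""
    d = x - mu
    return np.exp(-0.5 * d @ np.linalg.inv(sigma) @ d)

def composite(alphas, colors):
    """Ordered front-to-back alpha-blending of projected splats on one ray."""
    c, transmittance = np.zeros(3), 1.0
    for a, col in zip(alphas, colors):
        c += transmittance * a * col      # accumulate weighted color
        transmittance *= (1.0 - a)        # attenuate remaining light
    return c

# Toy check: an isotropic splat has density 1 at its mean.
mu = np.zeros(3)
sigma = 0.1 * np.eye(3)
print(gaussian_density(mu, mu, sigma))  # → 1.0
```

In the full method, the color and opacity fed into the blend come from decoding each splat's feature vector $f_i$ with the compact MLPs.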

3. Representation Architecture

3.1 Occupancy-Octree Coding for Coordinates

The spatial support is a recursively subdivided axis-aligned bounding box (AABB), where each internal node stores an 8-bit occupancy byte marking which of its eight octants are nonempty. Only nonempty octants are recursively subdivided, and leaf nodes correspond to individual splat locations. Storing the sequence of occupancy bytes via entropy coding (e.g., Huffman) achieves coordinate compression:

  • For $N$ splats and $N_{\rm int}$ internal nodes, total bits $\approx 8N_{\rm int} \ll 3RN$ for depth $R$
  • Empirical coordinate cost per splat: $b_{\rm coord} \approx 1.4$ bytes (MIP-NeRF 360)

This lossless structure ensures spatial queries and manipulation remain feasible and efficient.
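The recursive construction can be sketched as follows. This is a simplified NumPy version, with illustrative names; the real encoder would additionally entropy-code the resulting byte stream:

```python
import numpy as np

def build_octree(points, lo, hi, depth, max_depth, out_bytes):
    """Emit one occupancy byte per internal node, depth-first.

    Bit k of a node's byte is set iff octant k contains at least one point.
    Only nonempty octants recurse; cells at max_depth act as leaves.
    """
    if depth == max_depth or len(points) == 0:
        return
    mid = (lo + hi) / 2.0
    # Octant index: 3 bits from per-axis comparison against the midpoint.
    octant = ((points[:, 0] >= mid[0]).astype(int)
              | ((points[:, 1] >= mid[1]).astype(int) << 1)
              | ((points[:, 2] >= mid[2]).astype(int) << 2))
    byte = 0
    for k in range(8):
        if np.any(octant == k):
            byte |= 1 << k
    out_bytes.append(byte)
    for k in range(8):
        mask = octant == k
        if not mask.any():
            continue
        bits = [k & 1, k & 2, k & 4]          # which half-space per axis
        child_lo = np.where(bits, mid, lo)
        child_hi = np.where(bits, hi, mid)
        build_octree(points[mask], child_lo, child_hi,
                     depth + 1, max_depth, out_bytes)

pts = np.random.default_rng(0).random((1000, 3))
stream = []
build_octree(pts, np.zeros(3), np.ones(3), 0, 6, stream)
# `stream` is the occupancy-byte sequence that would then be Huffman-coded.
```

Decoding walks the same byte stream in the same depth-first order, so coordinates are recovered losslessly up to the leaf-cell resolution.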

3.2 Quantization and Arithmetic-Coding of Feature Vectors

Each splat’s feature vector $f_i$ (encoding color, opacity, geometry, etc.) is quantized via learned step sizes predicted from the hashed spatial index of $\boldsymbol{\mu}_i$:

$$(\mu_i^f,\, E_i^f,\, A_i) = \mathrm{MLP}_h\left(\mathrm{Hash}(\boldsymbol{\mu}_i)\right)$$

Quantized binning is

$$f_{i,q} = A_i \circ \mathrm{round}(f_i / A_i)$$

Probability distributions for arithmetic coding are based on predicted Gaussians:

$$p(f_{i,q}) \propto \mathcal{N}\left(f_{i,q};\, \mu_i^f,\, \mathrm{diag}(E_i^f)\right)$$

Only the quantized features and compact MLP weights are stored, yielding $b_{\rm feat} \approx 3.2$ bytes per splat (Wang et al., 30 Nov 2025).
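The quantization step and the resulting coding cost can be sketched as below. The per-dimension bin mass is the Gaussian CDF difference over the bin, and the arithmetic-coding cost is approximately its negative log2; all numeric parameters are assumed values for illustration, not the learned ones:

```python
import math
import numpy as np

def quantize(f, A):
    """Quantize features with per-splat learned step sizes A_i."""
    return A * np.round(f / A)

def bits_estimate(f_q, mu_f, sigma_f, A):
    """Approximate arithmetic-coding cost: -log2 of the Gaussian bin mass
    over [f_q - A/2, f_q + A/2], summed over feature dimensions."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    total = 0.0
    for q, m, s, a in zip(f_q, mu_f, sigma_f, A):
        p = cdf((q - m + a / 2) / s) - cdf((q - m - a / 2) / s)
        total += -math.log2(max(p, 1e-12))  # guard against zero-mass bins
    return total

# Hypothetical 8-dim feature with assumed step sizes and Gaussian params.
f = np.array([0.31, -0.12, 0.05, 0.9, -0.4, 0.0, 0.22, -0.7])
A = np.full(8, 0.1)        # predicted quantization steps (assumed)
mu_f = np.zeros(8)         # predicted bin-distribution means (assumed)
sigma_f = np.full(8, 0.5)  # predicted std-devs (assumed)
f_q = quantize(f, A)
print(f"{bits_estimate(f_q, mu_f, sigma_f, A):.1f} bits for this splat")
```

Because both the steps $A_i$ and the coding distribution come from the same hash-conditioned MLP, no per-splat side information beyond the quantized symbols needs to be stored.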

3.3 Overall Memory Model

For $N$ splats, total model size (excluding MLP weights) is

$$M = N\,(b_{\rm coord} + b_{\rm feat})\ \mathrm{bytes}$$

which, with the empirical per-splat costs above, yields $M \approx 4.75$ MB for standard real-world scenes (MIP-NeRF 360).
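A back-of-the-envelope check of this memory model, using the empirical per-splat costs quoted above and an assumed splat count of one million (the count is a round-number assumption, not a figure from the paper):

```python
# Memory model sanity check (MLP weights excluded).
b_coord = 1.4          # bytes/splat, octree-coded coordinates
b_feat = 3.2           # bytes/splat, quantized + entropy-coded features
n_splats = 1_000_000   # assumed splat count for a MIP-NeRF-360-scale scene

size_mb = n_splats * (b_coord + b_feat) / 1e6
print(f"{size_mb:.2f} MB")  # → 4.60 MB, in line with the reported ~4.75 MB
```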

4. Training, Compaction, and Encoding Strategy

The Smol-GS pipeline consists of the following algorithmic stages (35k iterations total):

  1. Warm-Up (0–0.5k): Initialize splats from SfM point clouds
  2. Densification (0.5–15k): Adaptive splitting/pruning based on $\|\nabla_{x_i}\mathcal{L}_1\|$ to match scene detail
  3. Compaction (15–20k): Prune excess splats via opacity penalty $\lambda_o$
  4. Feature Compression (20–30k): Activate quantization and NLL penalty $\lambda_q$ for $f_i$, $s_i$
  5. Coordinate Compression (30–35k): Fix splat positions, encode octree

The global loss combines photometric $\ell_1$ and SSIM loss, opacity sparsity, and negative log-likelihoods of feature quantization:

$$\mathcal{L} = (1-\alpha_s)\,\mathcal{L}_1 + \alpha_s\,\mathcal{L}_{\rm SSIM} + \lambda_o \sum_i o_i + \lambda_q\, \frac{1}{N} \sum_{i=1}^N \left[\mathrm{NLL}(f_{i,q}) + \mathrm{NLL}(s_{i,q})\right]$$

Pseudocode for the key algorithms—building the occupancy-octree and encoding features via arithmetic coding—is explicitly included in the reference [(Wang et al., 30 Nov 2025), Sec. 4.3].
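The combination of terms in the global loss can be sketched directly from the formula. The weights below are illustrative placeholders, not the paper's values:

```python
import numpy as np

def smol_gs_loss(l1, ssim_loss, opacities, nll_f, nll_s,
                 alpha_s=0.2, lam_o=1e-3, lam_q=1e-3):
    """Global objective: photometric blend + opacity sparsity + rate terms.

    alpha_s, lam_o, lam_q are assumed example weights.
    """
    photometric = (1 - alpha_s) * l1 + alpha_s * ssim_loss
    sparsity = lam_o * np.sum(opacities)                        # lam_o * sum_i o_i
    rate = lam_q * np.mean(np.asarray(nll_f) + np.asarray(nll_s))  # mean NLLs
    return photometric + sparsity + rate

# Dummy per-splat terms for two splats.
loss = smol_gs_loss(l1=1.0, ssim_loss=0.5,
                    opacities=np.array([0.9, 0.1]),
                    nll_f=[3.0, 2.0], nll_s=[1.0, 1.0])
print(loss)
```

During the schedule above, the rate terms are only active from the feature-compression stage onward.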

5. Benchmarking, Comparison, and Quantitative Results

Smol-GS is benchmarked on MIP-NeRF 360, Tanks & Temples, and Deep Blending. The following table summarizes performance for MIP-NeRF 360:

| Method          | PSNR↑ | SSIM↑ | LPIPS↓ | Size (MB) | Compression Ratio |
|-----------------|-------|-------|--------|-----------|-------------------|
| 3DGS-30K        | 27.21 | 0.815 | 0.214  | 734.0     | —                 |
| HAC++           | 27.60 | 0.803 | 0.253  | 8.74      | 84×               |
| Smol-GS (small) | 27.29 | 0.798 | 0.260  | 4.75      | 155×              |

Compression ratio is defined as $S_{\rm orig} / S_{\rm compr}$. Smol-GS achieves up to 155× compression over vanilla 3DGS-30K at matched rendering quality. Other metrics:

  • Training time: ≈32 min/scene (NVIDIA H200)
  • Encoding: 1–4 s/scene
  • Real-time rendering: 200–400 fps [(Wang et al., 30 Nov 2025), Table 1; Sec. 5.4]
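The ratios in the table follow directly from the reported sizes:

```python
# Compression ratio S_orig / S_compr, using the table's sizes in MB.
size_3dgs = 734.0  # 3DGS-30K baseline
for name, size in [("HAC++", 8.74), ("Smol-GS (small)", 4.75)]:
    print(f"{name}: {size_3dgs / size:.0f}x")
# → HAC++: 84x
# → Smol-GS (small): 155x
```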

6. Visual and Semantic Analysis; Downstream Applications

Figures 2 and 8 of (Wang et al., 30 Nov 2025) show that Smol-GS faithfully reconstructs sharp edges, specular reflections, and transparencies at drastic (order-of-magnitude) reductions in model size. In challenging regions (e.g., stainless-steel and glass surfaces), learned per-splat features offer better expressivity than standard spherical harmonics at a lower representation cost.

The discrete occupancy-octree forms an explicit spatial data structure enabling occupancy queries necessary for navigation and collision avoidance. Because attributes are decoupled and accessible, Smol-GS supports splat-wise semantic labeling, scene graph reasoning, and potentially forms a basis for SLAM, planning, and 3D scene understanding pipelines. This suggests utility not only as a rendering primitive but as a unified geometric/semantic abstraction layer for embodied or interactive AI.

7. Comparative Perspective and Research Context

Smol-GS is distinct from prior methods such as LocoGS, Mini-Splatting, OMG, Scaffold-GS, and HAC++ in several ways:

  • OMG (Lee et al., 21 Mar 2025) and its variants focus on attribute-level quantization, neural field compression, and importance-guided pruning—reducing, but not eliminating, coordinate redundancy or anchor-offset overhead.
  • HAC++ and Scaffold-GS reduce local redundancy but avoid aggressive coordinate compression due to fidelity concerns.
  • Smol-GS consolidates the spatial hierarchy using a lossless occupancy-octree and performs per-splat, spatially conditioned feature quantization, achieving higher compression and enabling explicit geometric/semantic manipulations.

A plausible implication is that occupancy-octree coordinate compression and learned semantic features facilitate hybrid use cases spanning rendering and scene understanding without bespoke retraining or expansion of the storage footprint.
