
P-4DGS: Anchor-Based Predictive Coding

Updated 8 December 2025
  • Anchor-based Predictive Coding (P-4DGS) is a technique that leverages anchor primitives and predictive models to compress 3D and 4D Gaussian Splatting representations efficiently.
  • It integrates spatial and temporal prediction, hyperprior-based entropy modeling, and context-aware quantization to optimize rate-distortion performance.
  • Extensive evaluations show that P-4DGS achieves superior compression ratios and high-fidelity rendering for both static and dynamic scenes.

Anchor-based Predictive Coding (P-4DGS) refers to a class of rate-distortion optimized, anchor-driven compression techniques for 3D and 4D Gaussian Splatting (3DGS, 4DGS). These methods use anchor primitives as compact scene descriptors and apply predictive coding (spatial, temporal, or context-based) to minimize the entropy-coded payload. The P-4DGS framework and its variants report state-of-the-art results in 3DGS/4DGS compression by combining anchor feature prediction, hyperprior-based uncertainty modeling, and context-aware quantization. This article surveys the fundamental architectures, prediction pipelines, entropy models, and empirical outcomes documented in recent works (Ma et al., 30 Mar 2025, Wang et al., 11 Oct 2025, Wang et al., 31 May 2024) and related sources.

1. Anchor-Based Scene Representation

The foundational principle in P-4DGS is an anchor-centric scene abstraction. A 3D scene is modeled as a finite collection of $N$ anchors, where each anchor $n$ comprises:

  • Position $x_n \in \mathbb{R}^3$
  • Scale $\ell_n \in \mathbb{R}^3$
  • $k$ learned offsets $\{o_n^i \in \mathbb{R}^3\}_{i=0,\ldots,k-1}$
  • Feature vector $f_n \in \mathbb{R}^D$

Each anchor "spawns" $k$ Gaussians, parameterized on the fly by an MLP:

  • $\mu_n^i = x_n + o_n^i \odot \ell_n$
  • $[c_n^i, r_n^i, s_n^i, \alpha_n^i] = \operatorname{MLP}_{\mathrm{render}}(f_n, \sigma_c^i, d_c^i)$, where $\sigma_c^i$ and $d_c^i$ encode relative view distance/direction.

During image rendering, $\alpha$-blending of these Gaussians in the camera space yields high-fidelity outputs (Ma et al., 30 Mar 2025, Wang et al., 11 Oct 2025).
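
The anchor-to-Gaussian expansion and the blending step can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited implementations: the function names, array shapes, and the single-pixel front-to-back compositing loop are illustrative assumptions, and the rendering MLP is omitted.

```python
import numpy as np

def spawn_gaussian_centers(x, scale, offsets):
    """Return centers mu_n^i = x_n + o_n^i * l_n (element-wise) for every anchor.

    x:       (N, 3) anchor positions
    scale:   (N, 3) per-anchor scales
    offsets: (N, k, 3) learned offsets
    """
    return x[:, None, :] + offsets * scale[:, None, :]

def composite_front_to_back(colors, alphas):
    """Alpha-blend depth-sorted samples (nearest first) for a single pixel."""
    colors = np.asarray(colors, dtype=float)
    transmittance = 1.0
    pixel = np.zeros(3)
    for color, alpha in zip(colors, alphas):
        pixel += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return pixel

# Hypothetical toy sizes: 4 anchors, 10 Gaussians per anchor.
rng = np.random.default_rng(0)
N, k = 4, 10
x = rng.normal(size=(N, 3))
scale = np.abs(rng.normal(size=(N, 3)))
offsets = rng.normal(size=(N, k, 3))
centers = spawn_gaussian_centers(x, scale, offsets)  # shape (N, k, 3)
```

Only the anchor-level quantities (position, scale, offsets, feature) need to be stored; the per-Gaussian centers are recomputed at render time.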

2. Predictive Feature Coding and Spatial Conditioning

P-4DGS pioneers a spatial condition-based prediction module to avoid direct transmission of the high-dimensional anchor features $f_n$. Instead, it harnesses two information streams:

  • Multi-resolution hash grid $H(x_n)$, supplying spatial context $f_{c,n} = H(x_n)$
  • A compact, learned residual $f_{r,n}$

The feature predictor (FP-Net), a 2-layer MLP, computes:

$f_{p,n} = P([\, f_{c,n} \,;\, f_{r,n} \,])$

In transmission, only $f_{r,n}$ (the residual) is directly entropy-coded, while $f_{c,n}$ derives deterministically from the quantized hash grid and anchor position.
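
A minimal sketch of this prediction step is given below, under several assumptions that are not from the source: the hash grid is reduced to a single-level nearest-vertex lookup, the weights are random rather than trained, and all dimensions and names (`hash_context`, `predict_feature`, `TABLE_SIZE`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes only; the actual dimensions are not given in the text.
TABLE_SIZE, CTX_DIM, RES_DIM, HIDDEN, FEAT_DIM = 1024, 8, 4, 32, 16

hash_table = rng.normal(size=(TABLE_SIZE, CTX_DIM))  # stand-in for the learned hash grid
W1 = rng.normal(size=(CTX_DIM + RES_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, FEAT_DIM)) * 0.1
b2 = np.zeros(FEAT_DIM)

def hash_context(x, resolution=64):
    """Toy single-level lookup: nearest grid vertex -> hashed table row (f_c)."""
    idx = np.floor(x * resolution).astype(np.int64)
    primes = np.array([1, 2654435761, 805459861], dtype=np.int64)
    h = np.bitwise_xor.reduce(idx * primes, axis=-1) % TABLE_SIZE
    return hash_table[h]

def predict_feature(x, f_r):
    """Two-layer MLP applied to the concatenation [f_c ; f_r]."""
    f_c = hash_context(x)
    hidden = np.maximum(np.concatenate([f_c, f_r], axis=-1) @ W1 + b1, 0.0)
    return hidden @ W2 + b2

x = rng.uniform(size=(5, 3))          # anchor positions
f_r = rng.normal(size=(5, RES_DIM))   # low-dimensional residuals (the coded part)
f_p = predict_feature(x, f_r)         # predicted anchor features, shape (5, FEAT_DIM)
```

Because `hash_context` depends only on the position and the shared table, the decoder can recompute the context itself, which is why only the residual needs to be coded per anchor.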

The rate-distortion loss combines the following terms:

  • $L_{\text{scaffold}}$ (pixel-level distortion)
  • $L_{\text{entropy}}$ (code length of $\{\ell_n, o_n, r_n\}$ and hyperprior $z_n$, where $r_n$ denotes the residual written $f_{r,n}$ above)
  • $L_{\text{hash}}$ (hash grid overhead)
  • $L_{\text{mask}}$ (regularization for potential pruning/masking) (Ma et al., 30 Mar 2025).
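
One plausible way to combine these terms is a weighted sum. The source does not state the weighting, so the trade-off weights $\lambda_e$ and $\lambda_m$ below are assumptions introduced only for illustration:

```latex
L = L_{\text{scaffold}}
  + \lambda_e \left( L_{\text{entropy}} + L_{\text{hash}} \right)
  + \lambda_m \, L_{\text{mask}}
```

Under this form, increasing $\lambda_e$ would favor smaller bitstreams at the cost of higher distortion.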

3. Instance- and Context-Aware Entropy Models

P-4DGS introduces sophisticated entropy models for compressing the residuals and anchor attributes.

Hyperprior modeling:

Residuals $r_n$ lack a strong spatial prior. An instance-aware hyperprior $z_n = E(r_n)$ (with a two-layer MLP encoder) parameterizes the conditional distribution:

$[\mu_{r,n}, \sigma_{r,n}, q_{r,n}] = M([z_n ; f_{c,n}])$

The residual $r_n$ is entropy-coded as a discretized Gaussian:

$p(r_n \mid z_n, f_{c,n}) = \prod_{j=1}^{d_r} \mathcal{N}_{\mathrm{discrete}}(r_{n,j}; \mu_{r,n,j}, \sigma_{r,n,j}, q_{r,n,j})$

$z_n$ itself is quantized (via a non-parametric CDF) and coded first, enabling decoder-side reconstitution before decoding $r_n$ (Ma et al., 30 Mar 2025).
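
The discretized Gaussian can be made concrete with a short sketch. It assumes the common construction in which the probability of a quantized value is the Gaussian mass of the bin of width $q$ centered on it; the source does not spell out this definition, and the function names are hypothetical.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def discretized_gaussian_prob(r, mu, sigma, q):
    """Probability mass of the quantization bin of width q centered at r."""
    upper = gaussian_cdf(r + 0.5 * q, mu, sigma)
    lower = gaussian_cdf(r - 0.5 * q, mu, sigma)
    return max(upper - lower, 1e-9)  # floor avoids log(0) in the bit estimate

def estimated_bits(residual, mu, sigma, q):
    """Sum of -log2 p over the channels, mirroring the product over j."""
    return sum(
        -math.log2(discretized_gaussian_prob(r, m, s, step))
        for r, m, s, step in zip(residual, mu, sigma, q)
    )
```

A residual close to its predicted mean falls in a high-probability bin and costs few bits, which is what rewards an accurate hyperprior during training.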

Contextual entropy coding:

Alternative approaches (e.g., ContextGS) structure anchors into hierarchical levels, with each finer-level anchor predicted autoregressively from already coded, spatially adjacent coarser-level anchors. At the top (coarsest) level, a quantized anchor-specific hyperlatent $z_i$ supplies necessary context. Conditional distributions of all attributes are modeled as convolved Gaussians:

$\mu_i^k, \sigma_i^k, \Delta_i^k = F^k(\psi_i^k)$

with arithmetic coding of attribute blocks driven by the predicted parameters (Wang et al., 31 May 2024).

4. Extensions for Dynamic (4D) Scene Compression

P-4DGS is not confined to static scenes. For spatiotemporal (4DGS) coding, such as in "P-4DGS: Predictive 4D Gaussian Splatting with 90× Compression", both intra-frame (spatial) and inter-frame (temporal) prediction are deployed (Wang et al., 11 Oct 2025):

  • Intra-frame: Anchor-based spatial MLPs predict primitive attributes from anchor features and view parameters.
  • Inter-frame: Canonical primitives are temporally deformed by a deformation MLP using positional/time encodings:

$(\Delta x_{a,i}, \Delta s_{a,i}, \Delta r_{a,i}) = \psi_d(\operatorname{concat}(\mathcal{E}(x_{a,i}), \mathcal{E}(t)))$

Final dynamic primitives are constructed by adding the predicted deformations to the canonical (static) parameters.
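
The deformation step can be sketched as follows. This is an illustration under stated assumptions rather than the published architecture: the sinusoidal encoding, the two-layer MLP, the quaternion (4-value) rotation, the layer widths, and the random weights are all hypothetical choices.

```python
import numpy as np

def positional_encoding(v, num_freqs):
    """Sin/cos encoding at frequencies 2^0 .. 2^(num_freqs-1); v has shape (N, D)."""
    v = np.asarray(v, dtype=float)
    freqs = 2.0 ** np.arange(num_freqs)
    angles = v[..., None] * freqs                       # (N, D, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(v.shape[0], -1)                  # (N, 2 * D * F)

def init_params(rng, in_dim, hidden=32, out_dim=10):
    """Random stand-ins for trained weights; out_dim = 3 (dx) + 3 (ds) + 4 (dr)."""
    return {
        "W1": rng.normal(size=(in_dim, hidden)) * 0.1, "b1": np.zeros(hidden),
        "W2": rng.normal(size=(hidden, out_dim)) * 0.01, "b2": np.zeros(out_dim),
    }

def deform(x, s, r, t, params, fx=4, ft=2):
    """Predict (dx, ds, dr) from encoded position and time, then add to canonical values."""
    t_col = np.full((x.shape[0], 1), float(t))
    enc = np.concatenate(
        [positional_encoding(x, fx), positional_encoding(t_col, ft)], axis=-1
    )
    hidden = np.maximum(enc @ params["W1"] + params["b1"], 0.0)
    out = hidden @ params["W2"] + params["b2"]
    dx, ds, dr = out[:, :3], out[:, 3:6], out[:, 6:10]
    return x + dx, s + ds, r + dr

rng = np.random.default_rng(0)
n, fx, ft = 5, 4, 2
x, s, r = rng.normal(size=(n, 3)), rng.normal(size=(n, 3)), rng.normal(size=(n, 4))
params = init_params(rng, in_dim=2 * 3 * fx + 2 * 1 * ft)
x_t, s_t, r_t = deform(x, s, r, t=0.25, params=params, fx=fx, ft=ft)
```

Only the canonical primitives and the MLP weights are stored; the per-frame parameters are regenerated for any queried time `t`.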

No explicit spatial or temporal residuals are transmitted; all prediction errors are absorbed in the learned anchor/MLP parameters, which are compressed jointly.

Adaptive quantization and context MLPs govern anchor attribute bitrate, with entropy coding stratified by learned per-anchor contexts from the hash grid. Hash grid transmission is further compressed using gzip- or range-coding, again reflecting learned binary context distributions.
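
As a rough illustration of adaptive quantization, the sketch below rounds attributes to multiples of a per-anchor step size and uses additive uniform noise as a training-time proxy. Both the per-anchor step and the noise proxy are assumptions (the latter is a common practice in learned compression); the source does not specify how the step sizes are derived.

```python
import numpy as np

def quantize(values, step, training=False, rng=None):
    """Quantize attributes to multiples of a (broadcastable) per-anchor step.

    training=True replaces rounding with uniform noise in [-step/2, step/2),
    a differentiable stand-in for the rounding error.
    """
    values = np.asarray(values, dtype=float)
    step = np.asarray(step, dtype=float)
    if training:
        if rng is None:
            rng = np.random.default_rng()
        return values + rng.uniform(-0.5, 0.5, size=values.shape) * step
    return np.round(values / step) * step

attrs = np.array([[0.26, -0.74], [1.30, 0.05]])
steps = np.array([[0.5], [0.1]])   # hypothetical step per anchor, shared across channels
attrs_q = quantize(attrs, steps)
```

A larger step lowers the bitrate of that anchor's attributes at the cost of a larger reconstruction error, bounded by half the step.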

5. Compression Pipeline and Algorithmic Summary

A representative encode/decode pipeline for anchor-based predictive coding is as follows (Ma et al., 30 Mar 2025, Wang et al., 31 May 2024, Wang et al., 11 Oct 2025):

  • Encode:
    1. Compute spatial context and initial prediction.
    2. Infer and quantize the residual, or jointly predict attributes from context/hyperprior.
    3. Entropy-code all quantized attributes and hyperlatents using arithmetic (or range) coding.
  • Decode:
    1. Decode the hash grid (if present), network weights, and quantization tables.
    2. For each anchor, reconstruct contexts from decoded coarser levels or the hyperprior, then decode attribute blocks by running the predictive MLPs to obtain the distribution parameters and entropy-decoding against them.
    3. Render from reconstructed anchor-derived Gaussians.

Autoregressive pipelines (e.g., ContextGS) require a hierarchical partition of anchors, with deterministic scan order to maintain reproducibility of contexts and reconstruction.
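
A hierarchical partition with a deterministic scan order can be sketched as follows. This is not the ContextGS procedure itself: the voxel-halving rule, the "first anchor in each voxel" representative, and the function names are assumptions chosen to show why encoder and decoder must traverse anchors identically.

```python
import numpy as np

def assign_levels(positions, base_voxel, num_levels):
    """Assign each anchor a level; level 0 is the coarsest.

    At each level the voxel size halves, and the first anchor (in scan order)
    of every voxel becomes that voxel's representative if not already assigned.
    Anchors never chosen fall to the finest level.
    """
    positions = np.asarray(positions, dtype=float)
    order = np.lexsort(positions.T[::-1])   # deterministic: sort by x, then y, then z
    levels = np.full(len(positions), num_levels - 1, dtype=int)
    assigned = np.zeros(len(positions), dtype=bool)
    for level in range(num_levels - 1):
        voxel = base_voxel / (2 ** level)
        seen = set()
        for i in order:
            key = tuple(np.floor(positions[i] / voxel).astype(int))
            if key in seen:
                continue
            seen.add(key)
            if not assigned[i]:
                levels[i] = level
                assigned[i] = True
    return levels, order

def coding_order(levels, order):
    """Coarse-to-fine traversal; within a level, follow the spatial scan order."""
    return [int(i) for level in range(levels.max() + 1) for i in order if levels[i] == level]

rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 4.0, size=(50, 3))
levels, order = assign_levels(positions, base_voxel=2.0, num_levels=3)
sequence = coding_order(levels, order)
```

Because the order depends only on decoded positions, the decoder can rebuild the same sequence and therefore the same contexts for every anchor.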

6. Quantitative Performance and Empirical Impact

Extensive evaluations on standard datasets establish P-4DGS and its derivatives as the prevailing state of the art in 3DGS/4DGS compression.

| Method | Size (MB) | PSNR (↑) | SSIM (↑) | LPIPS (↓) | Comment |
|---|---|---|---|---|---|
| 3DGS (uncompressed) | 744.7 | 27.49 | 0.813 | 0.222 | Vanilla, no compression (Wang et al., 31 May 2024) |
| Scaffold-GS | 253.9 | 27.50 | 0.806 | 0.252 | Prior anchor variant |
| HAC | 15.3 | 27.53 | 0.807 | 0.238 | State of the art before P-4DGS |
| ContextGS (low) | 12.7 | 27.62 | 0.808 | 0.237 | Anchor-level context coding (Wang et al., 31 May 2024) |
| P-4DGS (static) | 11.08 | 27.46 | 0.801 | 0.249 | Anchor-based predictive coding (Ma et al., 30 Mar 2025) |
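
The size column translates directly into compression ratios relative to uncompressed 3DGS. The short script below only does that arithmetic on the figures listed in the table; the variable names are illustrative.

```python
# Sizes in MB, copied from the table above.
sizes_mb = {
    "3DGS (uncompressed)": 744.7,
    "Scaffold-GS": 253.9,
    "HAC": 15.3,
    "ContextGS (low)": 12.7,
    "P-4DGS (static)": 11.08,
}

baseline = sizes_mb["3DGS (uncompressed)"]
ratios = {name: baseline / size for name, size in sizes_mb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x smaller than uncompressed 3DGS")
```

On these numbers the static P-4DGS entry is roughly 67× smaller than vanilla 3DGS, compared with about 59× for ContextGS (low) and about 49× for HAC.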

On dynamic scenes, "P-4DGS" reports $\sim 40\times$ to $90\times$ compression over D3DGS and 4DGS at comparable or superior rate-distortion.

P-4DGS variants routinely match or exceed previous methods in PSNR, SSIM, and LPIPS. Rendering speeds reach 250–270 FPS on high-end GPUs (Wang et al., 11 Oct 2025). Models are robust across real and synthetic datasets, including MipNeRF360, Tanks and Temples, and DeepBlending.

7. Comparative Methodologies and Variants

Multiple frameworks have extended or instantiated P-4DGS strategies:

  • Spatial Condition-Based Prediction (P-4DGS proper): Leverages hash grid and learned residuals with a hyperprior instance-aware entropy model (Ma et al., 30 Mar 2025).
  • Hierarchical ContextGS: Applies autoregressive context modeling of anchors at varying levels, using quantized hyperlatents to bootstrap the coarsest level (Wang et al., 31 May 2024).
  • Predictive 4DGS (P-4DGS dynamic): Unifies spatial and temporal anchor-based prediction for dynamic/4D scenes, jointly compressing anchor attributes, deformation, and MLP weights using learned context- and entropy models (Wang et al., 11 Oct 2025).
  • ADC-GS: Employs hierarchical decomposition and rate-distortion optimized anchor assignment to further reduce temporal coding redundancy, anchoring all deformation and coding in a canonical anchor space (Huang et al., 13 May 2025).

Each variant balances storage (MB-scale footprint), rate-distortion, rendering speed, and flexibility for static and dynamic scenes. Metrics such as PSNR, SSIM, and LPIPS are commonly adopted for benchmarking.
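
Of these metrics, PSNR has a simple closed form, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. A minimal sketch for images with values in $[0, 1]$ is shown below; the function name and the `max_value` parameter are illustrative, and SSIM and LPIPS are omitted because they require substantially more machinery.

```python
import numpy as np

def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    rendered = np.asarray(rendered, dtype=float)
    mse = np.mean((reference - rendered) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

For example, a uniform error of 0.1 on a unit-range image gives an MSE of 0.01 and therefore a PSNR of 20 dB.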


Anchor-based Predictive Coding (P-4DGS) frameworks, via spatial and temporal predictive modeling, quantized anchors, and nonlinear context/hyperprior-driven entropy models, constitute the leading paradigm for highly compressed Gaussian splatting representations in 3D/4D scene capture and rendering (Ma et al., 30 Mar 2025, Wang et al., 11 Oct 2025, Wang et al., 31 May 2024, Huang et al., 13 May 2025).
