P-4DGS: Anchor-Based Predictive Coding
- Anchor-based Predictive Coding (P-4DGS) is a technique that leverages anchor primitives and predictive models to compress 3D and 4D Gaussian Splatting representations efficiently.
- It integrates spatial and temporal prediction, hyperprior-based entropy modeling, and context-aware quantization to optimize rate-distortion performance.
- Extensive evaluations show that P-4DGS achieves superior compression ratios and high-fidelity rendering for both static and dynamic scenes.
Anchor-based Predictive Coding (P-4DGS) refers to a class of rate-distortion optimized, anchor-driven compression techniques for 3D and 4D Gaussian Splatting (3DGS, 4DGS). These methods consistently leverage anchor primitives as compact scene descriptors, employing predictive coding (spatial, temporal, or context-based) to minimize the entropy-coded payload. The P-4DGS framework and its variants are the state of the art in 3DGS/4DGS compression, blending anchor feature prediction, hyperprior-based uncertainty modeling, and context-aware quantization. This article surveys fundamental architectures, prediction pipelines, entropy models, and empirical outcomes as documented in recent works (Ma et al., 30 Mar 2025; Wang et al., 11 Oct 2025; Wang et al., 31 May 2024) and related sources.
1. Anchor-Based Scene Representation
The foundational principle in P-4DGS is an anchor-centric scene abstraction. A 3D scene is modeled as a finite collection of anchors, where each anchor comprises:
- Position
- Scale
- learned offsets
- Feature vector
Each anchor "spawns" a small set of Gaussians, whose attributes are parameterized on the fly by an MLP that takes the anchor feature together with the relative view distance and direction as input.
During image rendering, α-blending of these Gaussians in camera space yields high-fidelity outputs (Ma et al., 30 Mar 2025, Wang et al., 11 Oct 2025).
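As a rough illustration, the anchor structure and per-anchor Gaussian spawning can be sketched in NumPy. The sizes, field names, attribute count, and the random-weight two-layer MLP below are illustrative stand-ins, not the papers' trained architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N anchors, K spawned Gaussians per anchor, D-dim feature.
N, K, D = 4, 10, 32

anchors = {
    "position": rng.normal(size=(N, 3)),
    "scale": np.abs(rng.normal(size=(N, 3))),
    "offsets": rng.normal(size=(N, K, 3)) * 0.1,   # learned offsets
    "feature": rng.normal(size=(N, D)),            # anchor feature vector
}

def spawn_gaussians(anchors, cam_pos, mlp):
    """Decode K Gaussians per anchor; their attributes come from an MLP
    conditioned on the anchor feature plus relative view distance/direction."""
    pos, scale = anchors["position"], anchors["scale"]
    centers = pos[:, None, :] + anchors["offsets"] * scale[:, None, :]
    delta = cam_pos - pos                          # relative view vector
    dist = np.linalg.norm(delta, axis=-1, keepdims=True)
    view = np.concatenate([dist, delta / dist], axis=-1)
    x = np.concatenate([anchors["feature"], view], axis=-1)
    attrs = mlp(x)                                 # (N, K * attrs_per_gaussian)
    return centers, attrs.reshape(N, K, -1)

# Stand-in 2-layer MLP with random weights (a trained network in practice);
# 8 attributes per Gaussian is an arbitrary placeholder.
W1 = rng.normal(size=(D + 4, 64))
W2 = rng.normal(size=(64, K * 8))
mlp = lambda x: np.maximum(x @ W1, 0) @ W2

centers, attrs = spawn_gaussians(anchors, cam_pos=np.array([0.0, 0.0, 5.0]), mlp=mlp)
print(centers.shape, attrs.shape)
```

The key property is that only the anchors and the MLP weights need storing; the (much more numerous) Gaussians are regenerated at render time.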
2. Predictive Feature Coding and Spatial Conditioning
P-4DGS pioneers a spatial condition-based prediction module to avoid direct transmission of the high-dimensional anchor features. Instead, it harnesses two information streams:
- A multi-resolution hash grid, supplying spatial context
- A compact, learned residual
The feature predictor (FP-Net), a two-layer MLP, maps the hash-grid context at each anchor position to a feature prediction; the reconstructed anchor feature is this prediction plus the residual.
In transmission, only the residual is directly entropy-coded, while the prediction derives deterministically from the quantized hash grid and the anchor position.
A rate–distortion loss balances:
- a pixel-level distortion term
- a rate term measuring the codelength of the residuals and the hyperprior
- the hash-grid overhead
- a regularization term for potential anchor pruning/masking (Ma et al., 30 Mar 2025).
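A minimal sketch of this predict-then-code idea, using a random-weight stand-in for FP-Net and a feature-level distortion proxy (the actual distortion is pixel-level, computed after rendering):

```python
import numpy as np

rng = np.random.default_rng(1)
D = 32

# Hypothetical stand-ins: hash-grid context at the anchor, and the true feature.
context = rng.normal(size=(D,))     # interpolated from the hash grid
f_true = rng.normal(size=(D,))

# FP-Net stand-in: 2-layer MLP predicting the feature from spatial context.
W1 = rng.normal(size=(D, 64))
W2 = rng.normal(size=(64, D))
f_pred = np.maximum(context @ W1, 0) @ W2

# Only the quantized residual is entropy-coded; the prediction is re-derived
# on the decoder side from the shared hash grid and anchor position.
r = np.round(f_true - f_pred)       # quantized residual to transmit
f_rec = f_pred + r                  # decoder-side reconstruction

# Rate-distortion objective: distortion + lambda * rate (plus overheads).
distortion = np.mean((f_rec - f_true) ** 2)   # feature-level proxy here
rate_proxy = np.mean(np.abs(r))               # real rate comes from the entropy model
lam = 0.01                                    # illustrative trade-off weight
loss = distortion + lam * rate_proxy
print(f"loss={loss:.4f}")
```

By construction the reconstruction error is bounded by the quantization step, while the rate term pushes residuals toward zero, i.e. toward a fully predictable feature field.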
3. Instance- and Context-Aware Entropy Models
P-4DGS introduces sophisticated entropy models for compressing the residuals and anchor attributes.
Hyperprior modeling:
Residuals lack a strong spatial prior. An instance-aware hyperprior (with a two-layer MLP encoder) parameterizes the conditional distribution of each residual, which is then entropy-coded as a discretized Gaussian with predicted mean and scale.
The hyperlatent itself is quantized (via a non-parametric CDF) and coded first, enabling decoder-side reconstitution of the distribution parameters before decoding the residuals (Ma et al., 30 Mar 2025).
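The codelength implied by such a discretized Gaussian can be computed directly. This sketch assumes the standard formulation in which the probability of an integer bin is the Gaussian CDF difference across the bin (equivalently, a Gaussian convolved with a unit-width uniform):

```python
import math

def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def bits_for_residual(r_quantized, mu, sigma):
    """Codelength of an integer residual under a discretized Gaussian:
    P(r) = CDF(r + 0.5) - CDF(r - 0.5), i.e. probability mass of the
    unit-width bin centered at the integer value."""
    p = gaussian_cdf(r_quantized + 0.5, mu, sigma) - gaussian_cdf(r_quantized - 0.5, mu, sigma)
    p = max(p, 1e-12)               # guard against zero probability
    return -math.log2(p)

# A residual near the predicted mean is cheap (~1.4 bits); a far outlier
# costs tens of bits, so accurate prediction directly lowers the bitrate.
print(bits_for_residual(0, mu=0.1, sigma=1.0))
print(bits_for_residual(8, mu=0.1, sigma=1.0))
```

An arithmetic coder driven by these per-bin probabilities achieves essentially this codelength, which is what the rate term in the training loss estimates.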
Contextual entropy coding:
Alternative approaches (e.g., ContextGS) structure anchors into hierarchical levels, with each finer-level anchor predicted autoregressively from already coded, spatially adjacent coarser-level anchors. At the top (coarsest) level, a quantized anchor-specific hyperlatent supplies the necessary context. Conditional distributions of all attributes are modeled as Gaussians convolved with a unit-width uniform (accounting for quantization), with arithmetic coding of attribute blocks driven by the predicted distribution parameters (Wang et al., 31 May 2024).
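A toy sketch of the hierarchical scan and context assignment; the level assignments and nearest-neighbor context selection are hypothetical stand-ins for ContextGS's actual anchor grouping:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical: 12 anchors split into 3 levels (coarse -> fine).
positions = rng.normal(size=(12, 3))
levels = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2])

def coding_order_and_contexts(positions, levels):
    """Deterministic scan: level by level; each anchor's context is its
    nearest already-coded anchor from a coarser level. The coarsest level
    has no coded neighbors and falls back to a per-anchor hyperlatent
    (marked as -1 here)."""
    order, context_of = [], {}
    coded = []
    for lvl in sorted(set(levels.tolist())):
        idxs = sorted(np.flatnonzero(levels == lvl).tolist())  # fixed order
        for i in idxs:
            if coded:
                d = np.linalg.norm(positions[coded] - positions[i], axis=1)
                context_of[i] = coded[int(np.argmin(d))]
            else:
                context_of[i] = -1            # hyperlatent bootstraps level 0
            order.append(i)
        coded.extend(idxs)      # only after a level finishes may finer levels use it
    return order, context_of

order, context_of = coding_order_and_contexts(positions, levels)
print(order)   # coarse-to-fine scan order
```

Because the scan order and neighbor selection are deterministic, the decoder reconstructs exactly the same contexts from already-decoded anchors, which is the reproducibility requirement noted for autoregressive pipelines.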
4. Extensions for Dynamic (4D) Scene Compression
P-4DGS is not confined to static scenes. For spatiotemporal (4DGS) coding, such as in "P-4DGS: Predictive 4D Gaussian Splatting with 90× Compression", both intra-frame (spatial) and inter-frame (temporal) prediction are deployed (Wang et al., 11 Oct 2025):
- Intra-frame: Anchor-based spatial MLPs predict primitive attributes from anchor features and view parameters.
- Inter-frame: Canonical primitives are temporally deformed by a deformation MLP that takes positional and time encodings as input.
Final dynamic primitives are constructed by summing static and deformed parameters.
No explicit spatial or temporal residuals are transmitted; all prediction errors are absorbed in the learned anchor/MLP parameters, which are compressed jointly.
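The inter-frame step can be sketched as follows; the sinusoidal encodings and the random-weight deformation MLP are generic stand-ins for the learned components:

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Sinusoidal encoding of positions/time, as commonly used for
    deformation-MLP inputs."""
    freqs = 2.0 ** np.arange(num_freqs) * np.pi
    args = x[..., None] * freqs                   # (..., dims, num_freqs)
    enc = np.concatenate([np.sin(args), np.cos(args)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)

rng = np.random.default_rng(3)
canonical_pos = rng.normal(size=(5, 3))           # canonical primitives
t = 0.25                                          # normalized timestamp

# Stand-in deformation MLP with random weights (trained in practice).
in_dim = positional_encoding(np.zeros((1, 3))).shape[-1] \
       + positional_encoding(np.zeros((1, 1))).shape[-1]
W1 = rng.normal(size=(in_dim, 64)) * 0.1
W2 = rng.normal(size=(64, 3)) * 0.1

enc = np.concatenate([
    positional_encoding(canonical_pos),
    np.repeat(positional_encoding(np.array([[t]])), len(canonical_pos), axis=0),
], axis=-1)
delta = np.maximum(enc @ W1, 0) @ W2              # predicted temporal offset
dynamic_pos = canonical_pos + delta               # static + deformed parameters
print(dynamic_pos.shape)
```

Only the canonical anchors and the deformation-MLP weights are stored; per-frame positions are regenerated by evaluating the MLP at each timestamp, which is why no temporal residuals need transmitting.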
Adaptive quantization and context MLPs govern anchor attribute bitrate, with entropy coding stratified by learned per-anchor contexts from the hash grid. Hash grid transmission is further compressed using gzip- or range-coding, again reflecting learned binary context distributions.
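The hash-grid transmission step is straightforward to reproduce: a general-purpose codec such as gzip exploits the low entropy of quantized grid parameters. The grid contents below are synthetic placeholders:

```python
import gzip

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical quantized hash-grid parameters: small integers in a narrow
# range, which compress well under a general-purpose codec.
grid = rng.integers(-2, 3, size=65536).astype(np.int8)

raw = grid.tobytes()
packed = gzip.compress(raw, compresslevel=9)
print(len(raw), len(packed))      # gzip removes much of the redundancy
```

A range coder with learned binary context distributions, as the text describes, would squeeze this further by modeling the grid's actual statistics rather than relying on generic dictionary matching.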
5. Compression Pipeline and Algorithmic Summary
A representative encode/decode pipeline for anchor-based predictive coding is as follows (Ma et al., 30 Mar 2025, Wang et al., 31 May 2024, Wang et al., 11 Oct 2025):
- Encode:
- Compute spatial context and initial prediction.
- Infer and quantize the residual or jointly predict attributes from context/hyperprior.
- Entropy-code all quantized attributes and hyperlatents using arithmetic (or range) coding.
- Decode:
- Decode hash grid (if present), network weights, quantization tables.
- For each anchor, reconstruct contexts from decoded coarser levels or hyperprior, then decode attribute blocks by inverting predictive MLPs and entropy codes.
- Render from reconstructed anchor-derived Gaussians.
Autoregressive pipelines (e.g., ContextGS) require a hierarchical partition of anchors, with deterministic scan order to maintain reproducibility of contexts and reconstruction.
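The encode/decode roundtrip above can be condensed into a few lines; here entropy coding is replaced by a raw byte dump, and a random-weight MLP stands in for the shared predictive network:

```python
import numpy as np

rng = np.random.default_rng(5)

def predict(context, W):                 # shared predictive MLP (stand-in)
    return np.maximum(context @ W[0], 0) @ W[1]

# Shared side information: hash-grid context and network weights are
# transmitted once and available to both encoder and decoder.
D = 16
context = rng.normal(size=(8, D))
W = (rng.normal(size=(D, 32)), rng.normal(size=(32, D)))
features = rng.normal(size=(8, D))       # anchor features to compress

# --- Encode: quantize residuals against the prediction, serialize them.
residuals = np.round(features - predict(context, W)).astype(np.int32)
bitstream = residuals.tobytes()          # arithmetic/range coding in practice

# --- Decode: re-derive the prediction, add back the decoded residuals.
decoded = np.frombuffer(bitstream, dtype=np.int32).reshape(8, D)
reconstructed = predict(context, W) + decoded

# Reconstruction is exact up to the quantization step.
assert np.max(np.abs(reconstructed - features)) <= 0.5
```

The essential invariant is that encoder and decoder compute the identical prediction from identical side information, so only the quantized residual stream crosses the channel.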
6. Quantitative Performance and Empirical Impact
Extensive evaluations on standard datasets establish P-4DGS and its derivatives as the prevailing state of the art in 3DGS/4DGS compression.
| Method | Size (MB) | PSNR (↑) | SSIM (↑) | LPIPS (↓) | Comment |
|---|---|---|---|---|---|
| 3DGS (uncompressed) | 744.7 | 27.49 | 0.813 | 0.222 | Vanilla, no compression (Wang et al., 31 May 2024) |
| Scaffold-GS | 253.9 | 27.50 | 0.806 | 0.252 | Prior anchor variant |
| HAC | 15.3 | 27.53 | 0.807 | 0.238 | State of the art before P-4DGS |
| ContextGS (low) | 12.7 | 27.62 | 0.808 | 0.237 | Anchor-level context coding (Wang et al., 31 May 2024) |
| P-4DGS (static) | 11.08 | 27.46 | 0.801 | 0.249 | Anchor-based predictive coding (Ma et al., 30 Mar 2025) |
On dynamic scenes, "P-4DGS" reports compression ratios of up to 90× over D3DGS and 4DGS at comparable or superior rate-distortion:
- Example (D-NeRF, (Wang et al., 11 Oct 2025)): D3DGS: 39.45 MB at PSNR 36.28 dB; P-4DGS: 1.04 MB at PSNR 38.10 dB
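As a quick sanity check, the size reduction on this particular D-NeRF example works out to roughly 38× (the up-to-90× figure is the paper's headline across its evaluation):

```python
# Sizes from the D-NeRF example above, in MB.
d3dgs_mb, p4dgs_mb = 39.45, 1.04
ratio = d3dgs_mb / p4dgs_mb
print(f"{ratio:.1f}x")   # → 37.9x
```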
P-4DGS variants routinely match or exceed previous methods in PSNR, SSIM, and LPIPS. Rendering speeds reach 250–270 FPS on high-end GPUs (Wang et al., 11 Oct 2025). Models are robust across real and synthetic datasets, including MipNeRF360, Tanks and Temples, and DeepBlending.
7. Comparative Methodologies and Variants
Multiple frameworks have extended or instantiated P-4DGS strategies:
- Spatial Condition-Based Prediction (P-4DGS proper): Leverages hash grid and learned residuals with a hyperprior instance-aware entropy model (Ma et al., 30 Mar 2025).
- Hierarchical ContextGS: Applies autoregressive context modeling of anchors at varying levels, using quantized hyperlatents to bootstrap the coarsest level (Wang et al., 31 May 2024).
- Predictive 4DGS (P-4DGS dynamic): Unifies spatial and temporal anchor-based prediction for dynamic/4D scenes, jointly compressing anchor attributes, deformation, and MLP weights using learned context- and entropy models (Wang et al., 11 Oct 2025).
- ADC-GS: Employs hierarchical decomposition and rate-distortion optimized anchor assignment to further reduce temporal coding redundancy, anchoring all deformation and coding in a canonical anchor space (Huang et al., 13 May 2025).
Each variant balances storage (MB-scale footprint), rate-distortion, rendering speed, and flexibility for static and dynamic scenes. Metrics such as PSNR, SSIM, and LPIPS are commonly adopted for benchmarking.
Anchor-based Predictive Coding (P-4DGS) frameworks, via spatial and temporal predictive modeling, quantized anchors, and nonlinear context/hyperprior-driven entropy models, constitute the leading paradigm for highly compressed Gaussian splatting representations in 3D/4D scene capture and rendering (Ma et al., 30 Mar 2025, Wang et al., 11 Oct 2025, Wang et al., 31 May 2024, Huang et al., 13 May 2025).